Genetic diversity in Cameroon: an invisible obstacle in human genetic studies of malaria
Kevin Esoh is a bioinformatician and DELGEME Masters Fellow based in Cape Town, South Africa. The Developing Excellence in Leadership and Genetics Training for Malaria Elimination in sub-Saharan Africa (DELGEME) research program is supported by the Developing Excellence, Leadership and Training in Science in Africa (DELTAS Africa), a programme of the African Academy of Sciences. DELGEME is one of the 11 programmes funded through the African Academy of Sciences’ DELTAS Africa programme. DELTAS Africa funds collaborative networks/consortia led by Africa-based scientists to amplify Africa-led development of world-class research and scientific leaders on the continent, while strengthening African institutions. DELTAS Africa is implemented through AESA (The Alliance for Accelerating Excellence in Science in Africa), a funding, agenda setting and programme implementing platform of the AAS in partnership with the African Union Development Agency (AUDA-NEPAD) and with the support of Wellcome and the UK’s Department for International Development (DFID).
Malaria continues to kill almost half a million people globally every year (405,000 in 2018). A great majority of these deaths are in children under age five (67%), particularly in sub-Saharan Africa, which bears 93% of the global malaria disease burden (WHO, 2019). There is currently no approved vaccine against malaria, and first-line drugs (artemisinin derivatives) are inconsistently effective. Furthermore, the mosquitoes that transmit malaria are becoming resistant to the insecticides that are used to control them. As a result, the world has seen a slowdown in progress against the disease for the past three years.
Without an effective vaccine or more effective drugs, progress in preventing and treating malaria, and in reducing malaria deaths, may be reversed. For instance, severe malaria accounted for up to 22% of all deaths reported in health care facilities in Cameroon in 2017; 10% of these were children under five (Severe Malaria Observatory, 2018). Should the resistance to artemisinin-derived drugs that is currently plaguing South East Asia spread to Africa, these numbers will rise. Continued research into more effective strategies to combat the disease is therefore needed.
Over the past decade, approaches such as the genome-wide association study (GWAS) have proven valuable in probing the mechanisms by which the malaria parasite, Plasmodium spp., evades drugs and vaccines. Although these studies have led to the discovery of some important molecules used to fend off the parasites in humans, the more heterogeneous the study population (that is, the more ethnic groups or subgroups it contains), the more difficult it becomes to find these molecules. Genetic diversity in Africa is the greatest in the world, which means that many more molecules relevant to malaria prevention and treatment are yet to be discovered. Moreover, the genetic diversity of specific African populations remains understudied. Cameroon, home to the world’s most culturally diverse population, may have significant genomic differences among its ethnic groups that could bias genetic studies.
Genetic association studies usually sample large numbers of unrelated individuals from two groups in the general population: a group with the condition or disease (the case group) and a group without it (the control group). This is the typical case-control design. Its strength depends on how similar the two groups are (their homogeneity): the more closely the groups match on every other factor that can drive genetic differences among individuals, the more robust the findings about the condition under investigation. African populations differ in many factors associated with genetic differences among individuals, such as language, culture, religion and eco-geographic barriers, and Africans are the oldest population of anatomically modern humans (Uren et al., 2016). As a result, homogeneous sampling is almost never achieved.
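The case-control comparison described above can be illustrated with a minimal sketch of a single-SNP allelic test. This is not the analysis pipeline used in the study: the function name `allelic_chi_square` and the allele counts are hypothetical, chosen only to show how allele frequencies in cases and controls are compared at one position.

```python
import math


def allelic_chi_square(case_minor, case_major, ctrl_minor, ctrl_major):
    """Chi-square test (1 degree of freedom) on a 2x2 table of allele counts.

    Returns (chi2, p_value). All four arguments are allele counts, so each
    genotyped individual contributes two alleles.
    """
    a, b, c, d = case_minor, case_major, ctrl_minor, ctrl_major
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column of the table must be non-empty")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts for one SNP: 100 case alleles and 100 control alleles.
chi2, p = allelic_chi_square(60, 40, 40, 60)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

A genome-wide study repeats a test of this kind at millions of positions, which is one reason any systematic difference between the case and control groups other than the disease itself can produce misleading signals.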
Despite attempts to address heterogeneity in genetic studies on the Continent (Band, Rockett, Spencer, & Kwiatkowski, 2015; Malaria Genomic Epidemiology Network, 2019), these studies continue to underperform in Africa. Understanding the specific factors that account for significant genetic differences and similarities among Cameroonian individuals will be essential for designing the sampling and analysis strategies needed to increase the power of current methods used to study the population.
Description of Study
This project sought to characterize the genetic differences and similarities among Cameroonian populations, in order to understand how genetic diversity within the population may account for changes in the DNA (mutations) that lead to disease conditions. The project also screened for such mutations in individuals with malaria (cases) and individuals without malaria (controls) in the population.
In the initial phase of the study, after ethical approval and participant consent had been obtained, blood samples were collected from individuals with malaria who visited hospitals in the South West, Littoral and Centre regions of Cameroon between 2003 and 2008. Blood samples were also collected from healthy individuals and participants at primary schools. DNA was extracted from these samples at the Malaria Research Laboratory of the University of Buea in Cameroon (Achidi et al., 2012) and shipped to the Malaria Genomic Epidemiology Network Oxford Resource Centre in the United Kingdom for further processing. The genomic data, comprising 1,471 samples and 2.3 million single nucleotide polymorphisms (SNPs), were then returned for bioinformatics analysis.
Figure 1. Single Nucleotide Polymorphism (SNP): a change in the DNA of one or several individuals at a single position.
Bioinformatics methods for measuring genetic relatedness (such as co-ancestry estimation, genetic distance estimation and population admixture analysis) were used to determine the genetic diversity among three Cameroonian ethnic groups: Bantu, Semi-Bantu, and Fulani (also called Foulbe). This gave us a means to estimate the ancestral history of these ethnic groups from their genetic makeup, and to assess how closely or distantly related they are to one another genetically. Because these ethnic populations have cohabited for a long time, they have evidently shared genetic material with one another, making the general population highly mixed (or admixed). We therefore measured the level of genetic mixture among the ethnic groups. We also repeated these measurements to compare Cameroonian ethnic groups with other populations around the world whose genetic data have been made freely available by The International Genome Sample Resource (https://www.internationalgenome.org/) (Altshuler et al., 2010). This comparison was important because the populations represented in this database (known as the 1000 Genomes Populations, or simply 1KGP) are frequently used as references in genetic analyses around the world.
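As a rough illustration of what a genetic distance between two groups means, the sketch below computes Wright's fixation index (F_ST) from per-SNP allele frequencies. The article does not state which distance measure was used, so this choice is an assumption, and the function name `wright_fst` and the frequencies are hypothetical. A value near 0 means the two groups have nearly identical allele frequencies; a value near 1 means they share almost none.

```python
def wright_fst(freqs_pop1, freqs_pop2):
    """Wright's F_ST between two populations from per-SNP allele frequencies.

    Uses the ratio-of-sums form: sum(H_T - H_S) / sum(H_T), where H_S is the
    mean within-population heterozygosity and H_T is the heterozygosity of the
    pooled population. Assumes biallelic SNPs and equally weighted populations.
    """
    if len(freqs_pop1) != len(freqs_pop2):
        raise ValueError("both populations need a frequency for every SNP")
    numerator = 0.0
    denominator = 0.0
    for p1, p2 in zip(freqs_pop1, freqs_pop2):
        h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
        p_bar = (p1 + p2) / 2
        h_t = 2 * p_bar * (1 - p_bar)
        numerator += h_t - h_s
        denominator += h_t
    if denominator == 0:
        return 0.0  # no variable SNPs, so no measurable differentiation
    return numerator / denominator


# Hypothetical allele frequencies at three SNPs in two groups.
group_a = [0.10, 0.50, 0.80]
group_b = [0.12, 0.45, 0.20]
print(f"F_ST = {wright_fst(group_a, group_b):.3f}")
```

Real analyses use estimators that also correct for sample size and work from genotype data rather than fixed frequencies; this sketch only conveys the underlying idea.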
One reason that reference populations are useful is the relationship between malaria and sickle cell disease (SCD). SCD is known to contribute the strongest genetic effect on malaria in sub-Saharan Africa, in that carriers of one copy of the mutation that causes SCD enjoy significant protection from severe malaria (Kariuki & Williams, 2020). In fact, studies have shown that SCD arose because of the pressure malaria imposed on humans more than 20,000 years ago (Laval et al., 2019). This is a typical natural selection process.
Reference populations are useful in the study of natural selection processes, for example in determining the factors that cause some people in sub-Saharan Africa who carry one copy of the sickle cell gene to resist severe malaria (Kariuki & Williams, 2020), and in investigating how old some mutations (like the sickle cell mutation) are (Laval et al., 2019). In such studies, information about the ancestral state of the mutation (the original DNA nucleotide that was present before the mutation occurred) is vital. Although the ancestral states of many mutations have been determined and curated, a common challenge in African population genetics is determining which African population is the oldest on the ancestral tree, so that it can be used as the base population (out-group) when assessing natural selection. By comparing our study population to these reference populations, and by gathering information from previous studies, we were able to identify an African population with genetic data in the public database, the Mende of Sierra Leone (MSL), that could serve as an out-group in genetic studies on the Continent (Skoglund et al., 2017). Identifying the MSL as an out-group also enabled the mapping of genomic regions where diseases and changes in diet contributed to natural selection in different ethnic groups.
GWAS was also applied to screen for genomic variations in individuals with and without malaria, with the aim of identifying DNA and protein units that may serve as candidates for vaccines or drugs, or that simply contribute to understanding the malaria disease process.
The extensive genetic diversity of African populations often leads to undesirably high false discovery (false positive) rates (FDR) (Teo, Small, & Kwiatkowski, 2010). A high FDR is common when a significant genetic difference among individuals is due to, for example, ethnicity or geographic location rather than to the effect being studied (e.g. malaria). High FDRs are therefore associated with populations, such as those in Africa, that have significant cultural, linguistic, ethnic, geographical and ecological differences. This means that population genetic studies of individuals with African ancestry require more careful design and planning to account for these effects than studies involving other ancestral populations. We have generated substantial information from our population genetic study that could benefit similar studies on the Continent.
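The way population structure produces false positives can be shown with a toy calculation. The allele counts below are invented for illustration and are not data from the study; `chi_square_2x2` is a hypothetical helper. In each of two groups, the allele is equally common in cases and controls, so it has no link to the disease. However, the groups differ both in allele frequency and in the proportion of cases sampled, and pooling them creates a strong apparent association.

```python
def chi_square_2x2(case_minor, case_major, ctrl_minor, ctrl_major):
    """Chi-square statistic (1 degree of freedom) for a 2x2 allele-count table."""
    a, b, c, d = case_minor, case_major, ctrl_minor, ctrl_major
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column of the table must be non-empty")
    return n * (a * d - b * c) ** 2 / denom


# Made-up allele counts: (case_minor, case_major, ctrl_minor, ctrl_major).
# Within each group the minor-allele frequency is the same in cases and controls.
group_a = (480, 120, 160, 40)   # allele frequency 0.8; mostly cases sampled
group_b = (40, 160, 120, 480)   # allele frequency 0.2; mostly controls sampled
pooled = tuple(x + y for x, y in zip(group_a, group_b))

chi2_a = chi_square_2x2(*group_a)
chi2_b = chi_square_2x2(*group_b)
chi2_pooled = chi_square_2x2(*pooled)

print(f"group A alone: chi2 = {chi2_a:.1f}")
print(f"group B alone: chi2 = {chi2_b:.1f}")
print(f"pooled sample: chi2 = {chi2_pooled:.1f}")
```

In this toy case, analysing each group separately removes the spurious signal, which is the intuition behind accounting for ethnicity in the analysis.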
Our study has shown, for the first time, the fine-scale genetic relatedness of the Fulani population of Cameroon to their Bantu and Semi-Bantu counterparts. It has also shown that the ethnic groups form separate clusters, which may adversely affect genetic association studies of, for example, malaria in Cameroonian individuals. This suggests that ethnicity-based data analysis may be the most effective approach for genetic association studies in highly structured African populations on the Continent.