LifeNet Health LifeSciences now offers genotyping information corresponding to the donors of primary human hepatocytes. The genotyping includes the analysis of 112 gene variations associated with Phase I enzymes (cytochrome P450 isozymes) and Phase II enzymes (detoxification and transporter proteins). The actual sequencing call is provided as well as the allele frequency and reference allele matched to the donor’s ethnicity.
Genotyping information is important in experimental designs for compound discovery and development. It can be applied to studies of metabolism rates, drug-drug interactions, disease models, diversity in preclinical investigations, and 3D model and co-culture systems.
Three Different Classes of Genotypes
Single nucleotide polymorphisms (SNPs): SNPs are single base variations within the gene and are the most common class. Their sequencing calls are combinations of the reference allele and the alternate allele. The reference allele is the DNA base observed most frequently in a population, while the alternate allele is the less frequently observed base. For example, if the reference allele is A and the alternate allele is G, the combinations may be A/A, A/G or G/G. The first, A/A, is the most frequent homozygous genotype for that population. The second, A/G, is the heterozygote. The third, G/G, is the least frequent homozygous genotype for that population. The genotype is written with both bases, one inherited on the maternal allele and one on the paternal allele.
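The three combinations can be told apart by counting how many copies of the alternate allele appear in the sequencing call. The following is a minimal sketch; the function name classify_genotype and the label strings are assumptions made for this illustration and are not part of any LifeNet Health report format.

```python
def classify_genotype(call: str, ref: str, alt: str) -> str:
    """Classify a sequencing call such as 'A/G' against its reference and alternate alleles."""
    alleles = call.split("/")
    if len(alleles) != 2:
        raise ValueError(f"expected two alleles separated by '/', got {call!r}")
    for allele in alleles:
        if allele not in (ref, alt):
            raise ValueError(f"allele {allele!r} is neither reference nor alternate")
    # Zero, one, or two copies of the alternate allele decide the class.
    labels = ("homozygous reference", "heterozygous", "homozygous alternate")
    return labels[alleles.count(alt)]


for call in ("A/A", "A/G", "G/G"):
    print(call, "->", classify_genotype(call, ref="A", alt="G"))
```

The same function accepts a deletion call such as T/- when it is given ref="T" and alt="-".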
Frameshift Mutations: These include insertions and deletions. In terms of sequencing results, these genotypes may be designated as T/- or -/G. In the first example, the reference allele is T and the alternate allele is the deletion, written as the absence of T or ‘-’. In the second example, the reference allele is the absence of G, or ‘-’, and the alternate allele is the insertion of G.
Copy number variations: In the area of ADME and toxicology, copy number variations tend to be associated mostly with metabolism rate phenotypes and are designated with the number of copies present for that gene variation. For example, an ultra-rapid metabolizer may be designated as CYP2D6*2X3, where there are three copies of this particular variant. In this example, the variant is designated using the Star nomenclature, as discussed below.
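A designation such as CYP2D6*2X3 can be split into its gene, star number, and copy number. The sketch below assumes the simple pattern gene, asterisk, numeric star designation, and an optional X followed by a copy count; the function name parse_star_designation is hypothetical, and real star allele names can be more elaborate than this pattern allows.

```python
import re

# Assumed pattern: gene name, '*', numeric star designation, optional 'X<copies>'.
STAR_PATTERN = re.compile(r"^(?P<gene>[A-Z0-9]+)\*(?P<star>\d+)(?:[Xx](?P<copies>\d+))?$")


def parse_star_designation(designation: str) -> tuple[str, str, int]:
    """Return (gene, star number, copy number) for a designation like 'CYP2D6*2X3'."""
    match = STAR_PATTERN.match(designation)
    if match is None:
        raise ValueError(f"unrecognized designation: {designation!r}")
    # When no copy count is written, assume a single copy.
    copies = int(match["copies"]) if match["copies"] else 1
    return match["gene"], match["star"], copies


print(parse_star_designation("CYP2D6*2X3"))
print(parse_star_designation("CYP3A4*5"))
```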
Different Nomenclature Systems
The HGVS or dot nomenclature:
The dot nomenclature is a system developed by the Human Genome Variation Society (HGVS). Its prefixes are referred to colloquially as “c.” [C dot], “g.” [G dot] and “p.” [P dot].
The (“c.”) stands for coding DNA. For example, a variation found in coding DNA could be “c.82T>C.” This describes a transition of T to C at DNA base pair 82. Gene sequences are written in the 5’ to 3’ direction with base pair 1 being the start, so this particular SNP is found at the 82nd base pair in the gene sequence counting from the 5’ start.
The (“g.”) stands for genomic DNA. As more and more gene sequencing information has been accumulated and added to the various drafts of the human genome, some designations have used the genomic position as the starting position. For example, a variation found in genomic DNA could be “g.10345G>A” which depicts a G to A transition at genomic position 10345.
The (“p.”) stands for a change within the protein sequence. The example of “p.Ile437Arg” would designate an amino acid change from isoleucine to arginine at position 437.
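The three example designations can be broken into their parts with a small parser. This is a sketch that handles only simple single-base substitutions and single amino acid changes of the kind shown above, which is a small subset of the full HGVS recommendations; the function name parse_dot and the dictionary keys are assumptions made for this illustration.

```python
import re

# Simple substitutions only, e.g. c.82T>C or g.10345G>A.
NUCLEOTIDE = re.compile(r"^(?P<prefix>[cg])\.(?P<pos>\d+)(?P<old>[ACGT])>(?P<new>[ACGT])$")
# Single amino acid changes with three-letter codes, e.g. p.Ile437Arg.
PROTEIN = re.compile(r"^(?P<prefix>p)\.(?P<old>[A-Z][a-z]{2})(?P<pos>\d+)(?P<new>[A-Z][a-z]{2})$")


def parse_dot(description: str) -> dict:
    """Split a dot designation into prefix, position, original and changed residue."""
    for pattern in (NUCLEOTIDE, PROTEIN):
        match = pattern.match(description)
        if match:
            return {
                "prefix": match["prefix"],
                "position": int(match["pos"]),
                "from": match["old"],
                "to": match["new"],
            }
    raise ValueError(f"unsupported designation: {description!r}")


for example in ("c.82T>C", "g.10345G>A", "p.Ile437Arg"):
    print(example, "->", parse_dot(example))
```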
One issue with the dot nomenclature is that, as the draft of the human genome was filled in, the base pair numbering in the “c.” designations shifted. What was referred to as “c.82T>C” may now actually be “c.80T>C” or “c.87T>C.” Another issue is that a good portion of the dot nomenclature was designated using the antisense strand. In this case, “c.82T>C” may actually be “c.82A>G.”
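The strand issue amounts to replacing each base with its complement (A with T, C with G) while leaving the position as written. The sketch below covers only that base swap for simple substitutions; it does not attempt the renumbering problem described above, and the function name complement_substitution is hypothetical.

```python
import re

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
SUBSTITUTION = re.compile(r"^(?P<prefix>[cg])\.(?P<pos>\d+)(?P<old>[ACGT])>(?P<new>[ACGT])$")


def complement_substitution(description: str) -> str:
    """Rewrite a simple substitution as it would read on the complementary strand."""
    match = SUBSTITUTION.match(description)
    if match is None:
        raise ValueError(f"unsupported designation: {description!r}")
    old = COMPLEMENT[match["old"]]
    new = COMPLEMENT[match["new"]]
    # The position is kept unchanged, as in the example in the text.
    return f"{match['prefix']}.{match['pos']}{old}>{new}"


print(complement_substitution("c.82T>C"))
```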
The Star nomenclature:
The Star (*) nomenclature was proposed as an alternative system in which each SNP was originally represented by a “*” added after the isozyme designation, as in CYP3A4*5. However, it was realized that several SNPs can act together and be “linked” along the allele. Now, the “*” designation refers to the haplotype. Therefore, a maternal/paternal allele designation for CYP3A4 may be *1/*5, where *1 is a fully functioning allele and *5 most likely designates an allele that is not fully functioning.
The largest issue with the Star nomenclature is how to assign the appropriate metabolic rate phenotype (extensive [normal], intermediate, or poor) to a haplotype encompassing several SNPs. To date, several software tools have been programmed to make this classification. In addition, Gaedigk et al. proposed an activity scoring system in 2008. It assigns an activity value to each allele. The values of the two alleles are added up and the sum is compared to an activity scale to designate a phenotype.
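The additive scoring idea can be sketched in a few lines. The activity values, the function labels ("full", "reduced", "none"), and the cutoffs on the scale below are placeholders chosen only to make the example run; they are not taken from Gaedigk et al. or from any clinical guideline, and the function names are hypothetical.

```python
# Placeholder activity values (assumed for illustration only).
ACTIVITY_VALUE = {"full": 1.0, "reduced": 0.5, "none": 0.0}


def activity_score(components: list[str]) -> float:
    """Add up the activity values of the components being combined."""
    return sum(ACTIVITY_VALUE[component] for component in components)


def phenotype_from_score(score: float) -> str:
    """Compare a summed score to a placeholder scale (cutoffs are assumptions)."""
    if score == 0:
        return "poor"
    if score < 1:
        return "intermediate"
    return "extensive (normal)"


for components in (["none", "none"], ["reduced", "none"], ["full", "full"]):
    score = activity_score(components)
    print(components, score, phenotype_from_score(score))
```

A real implementation would replace the placeholder table and cutoffs with the published values for the enzyme of interest.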
The rs number nomenclature:
The last nomenclature is the “rs number.” This designation is a unique number which correlates one-to-one to a gene variant. For example, rs5030867 is a gene variant (T>G) in CYP2D6. The rs numbers are used throughout the SNP database of the National Center for Biotechnology Information.
Reference Alleles, Alternate Alleles, and Allele Frequencies
For each gene variant, there is a reference allele and an alternate allele. In the past, the reference allele was also referred to as the ancestral allele. The reference allele is the base which is most frequent in a population, whereas the alternate allele is the less frequent base. Insertions and deletions are represented by combinations of two or more bases. An example is the frameshift variant, rs4646438, of CYP3A4, where the reference allele is T and the alternate allele is TT. In terms of a sequencing call which notes the presence or absence of a base, the reference allele is “-” and the alternate allele is T.
How often a base for a gene variant is observed in a population is the allele frequency. For example, rs4149032 of the transporter SLCO1B1 has a reference allele of C and an alternate allele of T. The frequency of C in a European Caucasian population is 0.660330 and for T is 0.339670. The two frequencies should add up to 1.000000. The number of significant figures gives an indication of the confidence in the stated frequencies.
However, if the donor is African American, the reference allele is T with a frequency of 0.6543 and the alternate allele is C with a frequency of 0.3457. The reference allele is “flipped” for this ethnicity because the most frequent base in the African American population is T.
This last example is an illustration of why the ethnicity of the donor is so important. In the European Caucasian population, about 2/3 of the alleles are C and only about 1/3 are T. However, in the African American population, the opposite is true: about 2/3 of the alleles are T and only about 1/3 are C.
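The “flip” follows directly from choosing the most frequent base in each population as the reference allele. The sketch below uses the rs4149032 frequencies quoted above; the dictionary layout and the function name reference_and_alternate are assumptions made for this illustration.

```python
import math

# Allele frequencies for rs4149032 (SLCO1B1) as quoted in the text.
RS4149032_FREQUENCIES = {
    "European Caucasian": {"C": 0.660330, "T": 0.339670},
    "African American": {"T": 0.6543, "C": 0.3457},
}


def reference_and_alternate(frequencies: dict[str, float]) -> tuple[str, str]:
    """Return (reference, alternate): the most and least frequent base in a population."""
    if not math.isclose(sum(frequencies.values()), 1.0, abs_tol=1e-6):
        raise ValueError("allele frequencies should add up to 1")
    reference = max(frequencies, key=frequencies.get)
    alternate = min(frequencies, key=frequencies.get)
    return reference, alternate


for population, frequencies in RS4149032_FREQUENCIES.items():
    ref, alt = reference_and_alternate(frequencies)
    print(f"{population}: reference {ref}, alternate {alt}")
```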
The ethnicity of the donor is very important to take into consideration with respect to drug metabolism. In the case of rs4149032, African-descent populations, due to the high prevalence of T, have a lowered bioavailability of rifampin, a drug used to treat tuberculosis (TB). In those individuals who are homozygous for T or are heterozygous, it has been suggested that a higher recommended dose of rifampin be used to treat their TB.
Applications of Genotyping Information
Genotyping information can be essential to drug discovery and development studies. It may be used in studies of metabolism rates and drug-drug interactions, in investigations of diseases and medical conditions, in diversifying preclinical study populations to mimic clinical studies, and in deciding which primary human hepatocyte lots should be used for 3D model systems and co-cultures.
For disease modeling, using primary hepatocytes of different genotypes can help mimic different rates of progression of various conditions, such as NAFLD or hepatocellular carcinoma. Hepatocyte lots of different genotypes may help in studies of complex drug treatments by allowing the effects of a group of drugs to be compared with the single effect of each drug. Suspected ethnic susceptibilities may be observed by mimicking patient population stratification in vitro with hepatocyte lots of very diverse genotypes.
LifeNet Health LifeSciences Advantage
Primary human hepatocytes available through LifeNet Health LifeSciences have a variety of characterization information already available: comprehensive donor medical and social history, histopathological assessment and histological images, plateability, optimal seeding density, phase I and II enzyme activity, and CYP induction. In addition, along with the donor ethnicity, we provide the actual gene sequencing call, the rs number, the allele frequency, and the reference allele for each of 112 gene variants related to ADME and toxicological investigations.
This content was authored, reviewed, and approved by Ph.D. scientists. March 15, 2021.
Gaedigk, A., et al. Ten years' experience with the CYP2D6 activity score: a perspective on future investigations to improve clinical predictions for precision therapeutics. J. Pers. Med. 8(1): 1 (2018).
Chigutsa, E., et al. The SLCO1B1 rs4149032 polymorphism is highly prevalent in South Africans and is associated with reduced rifampin concentrations: Dosing implications. Antimicrob. Agents Chemother. 55(9): 4122-4127 (2011).