Calculation of expected and observed heterozygosity #703
Replies: 4 comments
-
(Posted by @timothymillar) The observed heterozygosity is a measure of heterozygosity in the real population and is typically estimated from a sample of multiple individuals from that population. In polyploids, the distinction between heterozygous and homozygous is less binary because the additional allele copies can result in a spectrum of heterozygosity. Hardy (2016) defined the individual heterozygosity […]
In diploids this metric will be 1 for a het call (AB) or 0 for a hom call (AA). In terms of API, the individual heterozygosity can be easily calculated from […]

The observed heterozygosity of a sample is simply the mean of the individual heterozygosities

$$H_o = \frac{1}{N} \sum_{i=1}^{N} H_i$$

where $H_i$ is the individual heterozygosity of individual $i$ and there are N individuals in the sample.
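A minimal sketch of the two quantities described above, not an existing library API. It assumes the individual heterozygosity is the probability that two allele copies drawn without replacement from one individual differ, which reproduces the diploid values of 1 for AB and 0 for AA. The function names are hypothetical, and missing allele calls are not handled.

```python
import numpy as np


def individual_heterozygosity(genotype):
    """Probability that two allele copies drawn without replacement differ.

    `genotype` is a sequence of allele codes for one individual,
    e.g. [0, 1] for a diploid AB call or [0, 0, 0, 1] for a tetraploid.
    (Assumed definition; see the lead-in.)
    """
    alleles = np.asarray(genotype)
    ploidy = alleles.size
    if ploidy < 2:
        raise ValueError("need at least two allele copies per individual")
    _, counts = np.unique(alleles, return_counts=True)
    # Ordered pairs of distinct copies that carry the same allele.
    identical_pairs = np.sum(counts * (counts - 1))
    return 1.0 - identical_pairs / (ploidy * (ploidy - 1))


def observed_heterozygosity(genotypes):
    """Mean individual heterozygosity over the N individuals in a sample."""
    return float(np.mean([individual_heterozygosity(g) for g in genotypes]))
```

For example, `individual_heterozygosity([0, 0, 0, 1])` gives an intermediate value between the diploid extremes, illustrating the spectrum of heterozygosity in polyploids.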
-
(Posted by @timothymillar) The expected heterozygosity is the rate of heterozygosity expected if a given population is in Hardy-Weinberg equilibrium (HWE). Nei and Roychoudhury (1974) give a formula for expected heterozygosity when the true allele frequencies are known for a population […] where M is the total number of allele copies in the sample, e.g. 2N for a diploid sample. A limitation of this method is that it assumes that all individuals in the sample are unrelated and non-inbred.

Nei and Chesser (1983) introduced a method of calculating expected heterozygosity which attempts to correct for the bias due to multinomial sampling of genotypes […] Hardy (2016) adapted this method to polyploids […] where k is the population ploidy.

Harris and DeGiorgio (2016) provide an alternative method, heterozygosity BLUE, that can calculate expected heterozygosity in the presence of related and/or inbred individuals of any ploidy. Their method takes a kinship matrix which is used to weight allele frequencies based on relatedness among individuals […] Heterozygosity BLUE is identical to the Nei and Roychoudhury method when the kinship matrix indicates unrelated and non-inbred individuals (i.e. the kinship matrix is the identity matrix divided by ploidy).
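As a rough illustration of a sample-size-corrected estimator, the sketch below assumes the commonly cited form $\hat{h} = \frac{M}{M-1}\left(1 - \sum_i p_i^2\right)$, where $p_i$ are the sample allele frequencies and M is the total number of allele copies. The function name is hypothetical, and the code makes the same assumption of unrelated, non-inbred individuals noted above. It does not implement the Nei and Chesser, Hardy, or heterozygosity BLUE methods.

```python
import numpy as np


def expected_heterozygosity(allele_counts):
    """Sample-size-corrected expected heterozygosity at one locus.

    `allele_counts` holds the number of copies of each allele observed
    in the sample, so its sum is M (e.g. 2N for N diploid individuals).
    Assumed form: M / (M - 1) * (1 - sum(p_i ** 2)).
    """
    counts = np.asarray(allele_counts, dtype=float)
    m = counts.sum()
    if m < 2:
        raise ValueError("need at least two allele copies in the sample")
    freqs = counts / m
    return float(m / (m - 1.0) * (1.0 - np.sum(freqs ** 2)))
```

With counts `[5, 5]` (M = 10) the uncorrected value is 0.5, and the correction factor 10/9 raises it slightly; a monomorphic locus such as `[10, 0]` gives 0.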
-
Thank you @timothymillar, this detail is very useful for those of us without a strong background in population genetics! On a related note, I collected some links based on an email that @alimanfoo sent me and wrote a post at https://discourse.pystatgen.org/t/genome-wide-selection-scans/90 to help us get up to speed on this topic. Please do pass along any additional details specific to your work that we can read up on!
-
(Posted by @timothymillar) Thanks @hammer
Will do, but I'm still figuring out the specifics!
-
(Posted by @timothymillar)
I wanted to start some discussion on the calculation (and API) of expected and observed heterozygosity, which are used in estimating inbreeding (F) and fixation indices (FST). These metrics are implemented in scikit-allel and could be generalized further to support polyploid data and (in the case of expected heterozygosity) related samples.