The genetic architecture of quantitative traits is central to understanding how selection response is sustained and to predicting disease risk. To describe the distribution of effects of the variants underlying quantitative traits, Fisher (1918) proposed a simple explanation, now termed the infinitesimal model, in which a quantitative trait is controlled by a very large number of unlinked loci, each of very small effect, whose effects add up linearly. Under this model, selection response can be sustained for a long time, because selection neither changes gene frequencies substantially nor depletes the genetic variance. However, genome-wide analyses often find genetic variants with considerable effects on quantitative traits, in contrast with the prediction of Fisher’s classic infinitesimal model. Since the model was constructed long before the genomic revolution, the question is whether it is still relevant given the patterns observed in genomic data. In this review, I discuss this question with emphasis on theoretical models of the distribution of allelic effects, the contribution of mutation to long-term selection response, and the most plausible explanation for missing heritability.
The genetics of adaptation: from Fisher’s Geometric to Orr’s Exponential Model
A theoretical model of the range of effects of beneficial mutations fixed by natural selection is crucial for determining the allelic effects underlying quantitative genetic variation. One view of adaptation, called micromutationism, holds that adaptation is a gradual process with no possibility of large leaps of beneficial change. From this perspective, Fisher proposed the geometric model of adaptation. Fisher imagined the phenotype of an organism as a point in a multi-dimensional sphere in which fitness increases smoothly inward to a maximum at the centre. The population starts on the surface of a sphere of diameter d, and a mutation moves the phenotype in a random direction by a distance (effect) r. A mutation can bring the phenotype closer to the optimum at the centre of the sphere (and thus be beneficial) only if r is less than d, and the probability that it does so increases as r becomes smaller. Thus, Fisher’s geometric model implies that adaptation proceeds by successive gradual steps, because mutations with infinitesimal effects have the highest chance of being advantageous.
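The geometric argument can be checked numerically. The following is a minimal Monte Carlo sketch, not Fisher's own derivation: it assumes the population sits at distance d/2 from the optimum (that is, on a sphere of diameter d), that mutations point in a uniformly random direction, and that a mutation counts as beneficial when it ends up closer to the optimum. The function name, the number of dimensions and the trial count are illustrative choices.

```python
import math
import random

def prob_beneficial(r, d, n, trials=20000, seed=0):
    """Monte Carlo estimate of the chance that a mutation of size r is beneficial.

    Assumptions (illustrative): optimum at the origin of an n-dimensional
    space, current phenotype at distance d/2 along the first axis,
    mutation direction uniform on the sphere.
    """
    rng = random.Random(seed)
    z0 = d / 2.0  # current distance from the optimum
    hits = 0
    for _ in range(trials):
        # A normalised vector of independent Gaussians is uniform in direction.
        g = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm = math.sqrt(sum(x * x for x in g))
        step = [r * x / norm for x in g]
        # Squared distance from the optimum after the mutation.
        new_sq = (z0 + step[0]) ** 2 + sum(x * x for x in step[1:])
        if new_sq < z0 * z0:
            hits += 1
    return hits / trials

if __name__ == "__main__":
    for r in (0.01, 0.5, 1.5):
        print(r, prob_beneficial(r, d=1.0, n=10))
```

Under these assumptions a very small mutation is beneficial almost half of the time, a mutation of intermediate size only rarely, and a mutation with r greater than d never.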
However, Fisher’s geometric model neglected the fact that a new mutation is initially present in only a few individuals. New mutations are therefore sensitive to stochastic loss, and the size of the mutational effect becomes an important factor in escaping that random loss: mutations with large beneficial effects have a higher probability of surviving than mutations with minor effects. Taking both factors into account (the chance of being beneficial and the chance of escaping random loss), Kimura revised Fisher’s geometric model and concluded that mutations with moderate effects contribute the most to adaptation.
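The trade-off between the two factors can be sketched numerically. This is a simplified illustration rather than Kimura's full treatment: it uses Fisher's approximation that a mutation of scaled size x is beneficial with probability 1 - Φ(x), where Φ is the standard normal distribution function, and it assumes that the probability of escaping random loss is simply proportional to x. The function names and the grid are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def prob_beneficial(x):
    """Fisher's approximation: chance that a mutation of scaled size x is beneficial."""
    return 1.0 - _STD_NORMAL.cdf(x)

def relative_contribution(x):
    """Chance of being beneficial times a survival chance assumed proportional to x."""
    return x * prob_beneficial(x)

def most_likely_fixed_size(x_max=4.0, steps=4000):
    """Scaled size on a regular grid that maximises the relative contribution."""
    grid = [x_max * i / steps for i in range(1, steps + 1)]
    return max(grid, key=relative_contribution)

if __name__ == "__main__":
    print(most_likely_fixed_size())
```

Under these assumptions the product peaks at an intermediate scaled size (around 0.75) rather than at the smallest sizes, which is the qualitative point of the revision.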
Orr extended the idea by considering the distribution of effects of the mutations fixed over entire bouts of adaptation, since the magnitude of the effect may differ between the steps of adaptation. Using analytical theory and computer simulation, he suggested that the effects of mutations fixed during adaptation follow an exponential distribution: a few mutations with large effects dominate the early stage, followed progressively by a larger number of mutations with smaller effects. The exponential pattern can be understood through the concept of adaptive space. At the start of adaptation there is still ample room for random mutations to be tested against selection, but the room remaining between the population and the centre (the fitness optimum) shrinks as selection proceeds. More recent work by Martin and Lenormand (2008), using multivariate quantitative genetics, yields a beta distribution as a generalization of the distribution of allelic effects, which approaches Orr’s exponential distribution when the number of characters is large.
The exponential pattern has been reported in a large body of QTL studies, including studies of skeletal element regulation in stickleback fish (Kingsley et al., 2004), flower colouration in monkeyflowers (Schemske et al., 2003) and yeast sporulation (Cohen & Lorenz, 2012). However, the magnitude of effects in QTL studies is often exaggerated and depends on the resolution of the mapping. Finer-resolution analysis of the tb1 locus in maize, previously identified as a large-effect QTL controlling ear morphology, showed that the QTL fractionates into five smaller linked loci (Doebley & Studer, 2011).
Two-dimensional view of the Fisher-Orr adaptation model. The light blue arrow shows a mutation with large effect (r > d), which results in a drop in fitness. Orr suggested that the effects of mutations follow an exponential pattern, dominated at first by a few genes of large effect (dark brown) and followed by many genes of small effect (red, green, yellow) until the fitness optimum is reached.
The importance of mutational variance for long-term selection response
Data from the Illinois maize divergent selection experiment (which started in 1896) showed that selection response for high oil content was still observed even after 100 generations (Dudley, 2005). The simplest explanation for the continuing response is that the trait is controlled by numerous variants of very small effect. However, the response could also come from new mutations of large effect, which can be picked up directly by selection. This view received little attention until Frankham (1980) found that a substantial part of the response to selection on Drosophila abdominal bristle number was mediated by mutation at the bobbed locus, which was not present in the standing genetic variation of the base population. In addition, Barton & Keightley (2001) showed that selection responses beyond 20 generations depend increasingly on the input of mutational variance.
Selection responses under the infinitesimal model and under a large-effect mutation model differ in several properties. Under the large-effect mutation model, the response depends on the random sampling of large mutations, which makes it uncertain and erratic. Furthermore, in the long run the selection response under both models approaches an asymptote (at the balance between mutational input and loss by drift), but under the large-effect mutation model the asymptotic response is predicted to be reached in earlier generations and to depend strongly on population size. A study by Keightley (1997) on divergent selection in inbred mice found an episodic and asymmetric response, which is suspected to be due to the contribution of large-effect mutations.
Does the missing heritability point to the infinitesimal model?
Missing heritability is the phenomenon whereby the genetic variants identified as associated with a quantitative trait cannot fully explain the heritability inferred from the population. For example, the largest GWAS meta-analysis of obesity-related traits to date (using almost 400000 individuals) found 97 loci associated with BMI, but these explain only about 2.7% of the variation in BMI, against an estimated heritability of 60% (Locke et al., 2015). Theoretically, the proportion of heritability explained by known genetic variants is defined as the ratio $h^2_{\text{known}} / h^2_{\text{all}}$, where $h^2_{\text{known}}$ is the heritability explained by the known variants and $h^2_{\text{all}}$ is the total heritability (including the undiscovered variants). Thus, based on this ratio, there are two possible sources of missing heritability: a low $h^2_{\text{known}}$ and an overestimation of $h^2_{\text{all}}$.
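As a rough worked example using the BMI figures quoted above, the ratio can be evaluated directly. The symbols $\pi_{\text{explained}}$, $h^2_{\text{known}}$ (heritability explained by known variants) and $h^2_{\text{all}}$ (total heritability) are labels chosen for this sketch, and treating the 2.7% of variation explained as $h^2_{\text{known}}$ is an assumption made for illustration.

```latex
\[
\pi_{\text{explained}}
  = \frac{h^2_{\text{known}}}{h^2_{\text{all}}}
  \approx \frac{0.027}{0.60}
  = 0.045
\]
```

On these numbers the 97 loci would account for roughly 4.5% of the estimated heritability.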
If the missing heritability is due to the numerator, one explanation is that many genetic variants of small effect are probably still being left out of the analysis. In an association analysis, the trait is tested against all variants in the genome, and this multiple testing inflates the rate of false positives. To reduce the chance of finding associations by chance alone, a stringent significance level is usually applied. However, a strict significance threshold comes at the cost of reduced statistical power, so that many variants of small effect may go undetected. If lack of power is the problem, additional significant variants can be found by increasing the power of the test with a larger sample size. Larger samples have indeed been shown to uncover more variants for human height: an initial study identified 180 variants using 180000 individuals (Lango Allen et al., 2010), and subsequent research found about 500 additional variants using a larger sample (250000 individuals) (Wood et al., 2014).
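The link between threshold, sample size and power can be sketched with a simple calculation. This is a minimal illustration under stated assumptions, not the method of any study cited here: it assumes a Bonferroni correction for a hypothetical one million independent tests, a single variant explaining a hypothetical 0.01% of phenotypic variance, and a normal approximation to a one-degree-of-freedom association test. All function and parameter names are illustrative.

```python
from statistics import NormalDist

def bonferroni_alpha(family_alpha, n_tests):
    """Per-test significance level that keeps the family-wise error at family_alpha."""
    return family_alpha / n_tests

def gwas_power(n, var_explained, alpha):
    """Approximate power to detect a variant explaining var_explained of the variance.

    Normal approximation: the test statistic is treated as N(sqrt(ncp), 1),
    with non-centrality ncp = n * q / (1 - q) for variance fraction q.
    """
    z = NormalDist()
    ncp = n * var_explained / (1.0 - var_explained)
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)  # two-sided critical value
    shift = ncp ** 0.5
    return (1.0 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)

if __name__ == "__main__":
    alpha = bonferroni_alpha(0.05, 1_000_000)  # hypothetical number of tests
    for n in (20_000, 250_000, 1_000_000):
        print(n, gwas_power(n, var_explained=1e-4, alpha=alpha))
```

Under these assumptions the same small-effect variant is almost never detected with 20000 individuals, is detected in a minority of studies with 250000, and is detected almost always with one million, which is why larger samples keep adding loci.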
Another potential source of low $h^2_{\text{known}}$ (the heritability explained by known variants) is the contribution of rare alleles of large effect, which cannot be detected by current association studies designed for common variants. However, the presence of rare alleles with high penetrance is not supported by the results of Yang et al. (2010) on human height and Cogni et al. (2015) on Drosophila antiviral resistance.
Height is a classical quantitative trait in humans, but despite much effort, the latest findings show that the cumulative effect of significant variants explains less than 20% of the heritability of height, which is expected to be about 80% (Wood et al., 2014). Yang et al. (2010) took a different statistical approach to assessing the possible contribution of rare variants, using a mixed model that allows all common variants to be incorporated simultaneously. The cumulative effect of all common variants explains 45% of the phenotypic variance, and the remaining heritability is proposed to be due to imperfect LD between the analysed variants and the true causal variants, which leaves little room for a contribution of rare variants. In addition, Cogni et al. (2015) used multi-parent advanced intercrosses of Drosophila, which can push rare alleles to intermediate frequencies, but missing heritability was still observed.
Zuk et al. (2012) provide an alternative view of missing heritability. Instead of focusing on the numerator, they argued that missing heritability may be due to overestimation of the denominator. The total heritability ($h^2_{\text{all}}$) is estimated indirectly from population data, under the assumption that the causal loci act additively. However, epistatic interactions are also possible and generate deviations from linearity. Neglecting epistatic effects inflates the estimate of total heritability, an inflation called ‘phantom heritability’, and consequently the proportion of heritability explained by known variants is underestimated.
Thus, under the ‘phantom heritability’ hypothesis, missing heritability is presumably caused not by a failure to find enough variants but by the use of an inappropriate statistical model that assumes complete additivity. They proposed a new model, called the limiting pathway (LP) model, which takes epistatic interactions into account. Re-analysing known genetic variants in Crohn’s disease, they showed that under the LP model the known variants account for a markedly larger proportion of heritability than in the previous analysis under the additive model. Another interesting finding is that genetic interactions could account for 80% of the missing heritability.
However, the extent to which epistatic interactions contribute to quantitative variation is still a matter of controversy. A large contribution of non-additive genetic variance is seriously challenged by the theoretical study of Hill et al. (2008) and by the empirical evidence of Bloom et al. (2013, 2015).
Bloom et al. (2013) used 1008 segregants of yeast (Saccharomyces cerevisiae) to quantify the level of genetic interactions. The advantage of studying genetic interactions in yeast is that, because the segregants are haploid, dominance variance is absent from the total genetic variance. Thus, under uniform environmental conditions, broad-sense heritability contains only additive genetic variance and epistatic variance. By taking the difference between broad-sense heritability ($H^2$) and narrow-sense heritability ($h^2$, which consists only of additive genetic variance), the level of gene-gene interaction can be quantified. The results showed that genetic interactions typically contribute less than a quarter of the additive genetic variance. The same result was obtained in an extended study (Bloom et al., 2015) that used a larger sample (4390 segregants) to increase statistical power.
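The subtraction described above can be written out explicitly. This is a sketch in standard variance-component notation, which is assumed here rather than taken from Bloom et al.: $V_A$ is additive genetic variance, $V_I$ is epistatic (interaction) variance and $V_P$ is total phenotypic variance, with dominance variance absent in haploids.

```latex
\[
H^2 = \frac{V_A + V_I}{V_P}, \qquad
h^2 = \frac{V_A}{V_P}, \qquad
H^2 - h^2 = \frac{V_I}{V_P}
\]
```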
From a theoretical standpoint, Hill et al. (2008) stated that genetic interactions observed at the molecular level do not necessarily translate into epistatic variance at the population level. They estimated the proportions of genetic variance under several allele frequency distributions, with the neutral model as the reference point. In the neutral model, equilibrium allele frequencies are determined by the joint action of drift and mutation, which leads to a U-shaped distribution. They estimated that when the allele frequency distribution has high density at extreme values (near 0 or 1), most of the genotypic variance is additive, regardless of the level of dominance or epistasis at the gene level. Hill et al. (2008) also summarized the correlations between monozygotic twins ($r_{MZ}$) and dizygotic twins ($r_{DZ}$) from an Australian twin study, and showed that $r_{MZ}$ is approximately twice $r_{DZ}$, indicating that additivity is the main determinant of the genetic variance. Therefore, it can be inferred that, even though epistasis is pervasive as a mechanism of gene action at individual loci, the same does not hold for epistatic variance at the population level.
Thus, it can be concluded that, up to now, a genetic architecture of numerous loci each of minor effect is the most parsimonious and convincing explanation for missing heritability, which points to the validity of the infinitesimal model. However, variants of major effect are increasingly recognized as contributing to sustained selection response and to the exponential distribution of allelic effects during adaptation, although they tend to be rare.
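How allele frequencies shape the additive share of genetic variance can be illustrated with a toy single-locus calculation. The sketch below is not the analysis of Hill et al.: it assumes one biallelic locus with complete dominance (d = a), uses the standard textbook expressions for additive and dominance variance, and weights allele frequencies either uniformly or by a U-shaped density proportional to 1/(p(1-p)), truncated at 1/(2N) for a hypothetical N = 100. All names and parameter values are illustrative.

```python
def variance_components(p, a=1.0, d=1.0):
    """Additive and dominance variance at one biallelic locus.

    p is the frequency of the allele that increases the trait, a is the
    additive effect and d the dominance deviation (d = a: complete dominance).
    """
    q = 1.0 - p
    alpha = a + d * (q - p)  # average effect of an allele substitution
    va = 2.0 * p * q * alpha ** 2
    vd = (2.0 * p * q * d) ** 2
    return va, vd

def additive_fraction(weight, n_pop=100, a=1.0, d=1.0, grid=10000):
    """Expected VA / (VA + VD) when allele frequency follows the density `weight`."""
    lo = 1.0 / (2 * n_pop)
    hi = 1.0 - lo
    sum_va = 0.0
    sum_vd = 0.0
    for i in range(grid):
        p = lo + (hi - lo) * (i + 0.5) / grid  # midpoint rule
        w = weight(p)
        va, vd = variance_components(p, a, d)
        sum_va += w * va
        sum_vd += w * vd
    return sum_va / (sum_va + sum_vd)

def uniform(p):
    return 1.0

def u_shaped(p):
    return 1.0 / (p * (1.0 - p))

if __name__ == "__main__":
    print("uniform :", additive_fraction(uniform))
    print("U-shaped:", additive_fraction(u_shaped))
```

In this toy case the additive share is roughly 0.75 under uniform frequencies and roughly 0.80 under the U-shaped weighting, even though gene action at the locus is fully dominant. The example covers dominance only; it does not model epistasis.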