The Science of Everything Podcast - Episode 146: Mendelian Genetics and Inheritance

Starting point is 00:00:33 You're listening to the Science of Everything podcast, episode 146, Mendelian Genetics. I'm your host, James Fodor. In this episode, we're going to discuss the inheritance of genetic traits, focusing on classical or Mendelian genetics. So this refers to the set of theories and methods that were developed in the late 19th through early 20th century, before the molecular revolution, before we understood the structure of DNA and were able to manipulate DNA at the molecular level. In particular, we're going to discuss the inheritance of single traits and talk about Mendel and his pea plants, and we'll discuss sex-linked traits, and then we'll gradually build up and look

Starting point is 00:01:16 at more complex forms of inheritance that don't obey all of the simple laws that Mendel derived, including the inheritance of multiple independent traits. Then we'll look at modifications of Mendelian ratios, including incomplete dominance, codominance, and epistasis, and penetrance and expressivity, and then we'll conclude by talking about the inheritance of complex traits, looking at quantitative traits and heritability. There isn't really any recommended pre-listening for this episode, because often, in fact, Mendelian genetics is covered in courses before molecular genetics. We kind of, in this podcast, did it the other way around, though having a little bit of

Starting point is 00:01:53 background about molecular genetics certainly wouldn't hurt, so you could look at episodes 34 and 35, where I talk about DNA structure and function. All right, so let's jump straight in and begin by talking about the inheritance of single, simple traits. So first of all, let's introduce some key terminology that will be important throughout this episode. A trait is just a characteristic of an individual. So something that makes one individual distinctive from another individual. Some traits are heritable, some are not. So height would be an example of a trait, eye color, hair color, weight,

Starting point is 00:02:27 etc. So some of those are going to be heritable, some not, and some are heritable to some degree, right? So a trait is just any characteristic, typically an observable characteristic, of an individual. The genotype is the genetic constitution or genetic makeup of an individual. So the genotype is unique for each individual, except for identical twins or clones. And you can think of the genotype as the set of all of the genes and other genetic material that an individual has. It's the full set of genes. The phenotype is the set of all observable traits of an individual. The phenotype is produced by an interaction between the genotype and the environment.

Starting point is 00:03:05 So the phenotype is kind of like a set of all traits, observable traits that an individual has. Genotype is the set of all genes, genetic material generally. The phenotype is determined by genotype plus environment. Another important concept is wild type. So this is sort of typically used to describe plants and animals. The wild type is the normal form or the typical form, whether that's phenotype or genotype, that's found in the wild, hence the name. It's distinguished from various mutants or artificial variations that have been selectively bred

Starting point is 00:03:40 or produced by genetic engineering. Because Mendel's experiments were conducted in pea plants and much genetic research has been conducted in plants, it's important to understand that many types of plants, well, flowering plants in particular, can breed in different ways. They can be crossbred or they can be selfed. Crossbreeding a flowering plant would mean taking the pollen from a flower of one plant and putting it on the stigma of a different plant. It's what we would think of as normal mating: you mate two different individuals with each other, right?

Starting point is 00:04:17 But plants can also self-fertilize. So that's when you take the pollen from a flower on one plant and put it on its own stigma. So that's called selfing, essentially self-fertilization. And, repeated over several generations, that results in nearly genetically identical offspring because you're not crossing genetic material from different individuals. You're combining genetic material from the same individual. Now, let's talk about Mendel's experiments. So Mendel was actually a monk who lived in Moravia in the mid-19th century,

Starting point is 00:04:47 but he sort of pioneered this type of genetic research, and he worked with pea plants. He chose seven characteristics of pea plants: the height of the plant, the pod shape, the pod color, the seed shape, the seed color, and the flower position and color. Now, Mendel was either very clever or very lucky or sort of both in choosing these traits, because it turns out that these traits are all binary traits, which means that there are only two different phenotypes for each of the traits. So yellow or green color, wrinkled or smooth seeds, tall or short plants, etc.

Starting point is 00:05:19 This is important for understanding the results that he derived because he's only using traits that have two possible phenotypes or two possible versions. Now, Mendel wanted to know, why do different plants have different traits? Why do seed type and shape and color and height and so forth differ between one plant and another? And also, how are such traits passed on to the descendants of a particular plant? How is that affected by crossbreeding, selfing, and things like that? So he developed this method where first he produces what's called a true breeding line, which just means that it's an artificially bred line with genetically identical offspring. So you get that just by taking a particular individual and selfing it for multiple generations.

Starting point is 00:06:02 So there should be very little genetic variation within that line if you've selfed it repeatedly and sort of removed any variation. So he started off with what's called the parental generation. This is a genetically identical set of plants. Then he took two parentals which had different traits. So for example, yellow peas versus green peas. So he takes one individual from each of these two different true breeding lines, which together form the parental generation, and he crosses them together.

Starting point is 00:06:34 So he does this many times, right? He doesn't just do it with one plant. He takes one green plant, one yellow plant, crosses them. One green, one yellow, crosses them. So he does it for a whole bunch, but together they form the P generation or the parental generation. The offspring of the P generation are called the F1 generation. F stands for filial, it just sort of relates to families, right? And F1 because it's the first filial generation following the parentals.

Starting point is 00:06:59 This is nomenclature that's widely used, so it's important to understand. Now what he found, and this is one of the first interesting findings, is that all F1 plants had the same phenotype. So they were all the same when it came to the trait. So if the parentals were green and yellow, he found that all of the F1 generation were green. Likewise for the different traits, height, pod shape, color and so forth. He's doing each trait separately, right?

Starting point is 00:07:26 He's just looking at one trait at a time. Then what happened is that he takes members of the F1 generation and crosses them together to produce the F2 generation. Okay, so there's three generations here: parental, F1 and F2. So as I just mentioned, the first interesting finding was that all of the F1 generation had the same phenotype. In the color case, the pods are all green. So that's interesting because the parentals were different.

Starting point is 00:07:50 One was green and one was yellow, but their children, their offspring, are all green. So what happened to the yellow phenotype? How come we don't see that? Well, actually, what he found is that when you cross members of the F1 generation together to produce their offspring, you do see yellow pods. He called this the re-emergence of that phenotype in the F2 generation.

Starting point is 00:08:11 So there's a trait that, in a sense, skips a generation, right? It's seen in the parentals, not in F1, but it is seen in F2. And there's more to it than that, because there's a particular mathematical structure to how he observed the traits. So, again, the parentals have green pods and yellow pods, right? One of each. In the F1 generation, we only see green pods. We don't see any yellow pods.

Starting point is 00:08:34 So 100% of the phenotype in F1 is green. In F2, what we'll see is a 3 to 1 ratio of phenotypes. So three green for every yellow. And that always holds, again, for all of the traits that he looked at. I'm just talking about the pod color example, but this holds for the other traits that he looked at too. In all of these cases, whichever trait it was that appeared in F1 was also the more common trait in F2, and the trait that was absent in the F1 generation always re-emerged in the F2 generation, giving this 3 to 1 ratio.

Starting point is 00:09:12 So these are very interesting results, and how did Mendel explain these results? Bear in mind that Mendel didn't understand anything about DNA or genes or even chromosomes. But what he postulated is that heredity is determined by unobserved factors, as he called them. We would call them genes, but he didn't know about genes. These factors come in different forms. We would call these alleles, and an allele is a variant of a gene, usually due to one or more mutations, right? But Mendel talked about factors and forms of those factors. Now the form of each factor determines the phenotype of the organism. So this was quite insightful because he basically discovered the existence of genes,

Starting point is 00:09:48 though not the molecular basis of them, just by looking at simple breeding experiments with pea plants. So that's pretty cool. The second postulate that he made is that for each trait, an organism always inherits two factors, one from each parent. In true breeding lines, like the parental generation, these factors are identical to each other, while in hybrid plants,

Starting point is 00:10:08 so those produced by crossing two pure-breeding lines, the inherited factors differ from each other. And this is how he explained the re-emergence of the yellow trait in the F2 generation compared to the F1 generation where it wasn't observed, because he postulated that in the P generation, the parentals, both of the copies of the color factor that each of the parents had were the same. So the green plant had two green factors and the yellow plant had two yellow factors, right? That was his postulate. And then what he reasoned was that each of the offspring inherits one factor from each parent. So in the F1 generation, because all of the F1 offspring have one green and one yellow parent, each F1 generation plant will have one green factor

Starting point is 00:10:55 and one yellow factor. However, in the F2 generation, it's going to be different because, and this was his next postulate, the law of segregation, he said that factors segregate independently of each other. What this means is, if you think about it, in the F1 generation, all of the individuals have one green and one yellow factor, so they're all the same. But their children, so the F2 generation, will be inheriting each of their factors from their parents. And so some of the offspring in the F2 generation will get one green and one yellow, some will get two greens, and some will get two yellows. And if that inheritance goes independently, you actually expect there to be two times as many green-yellow as either green-green or yellow-yellow.
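As a quick illustration of that counting argument, here is a minimal Python sketch that enumerates every way an F2 plant can inherit one factor from each F1 parent. The letters "G" (green factor) and "y" (yellow factor) and the function name are assumptions made for this example, not Mendel's own notation.

```python
from collections import Counter
from itertools import product

def f2_genotype_counts(parent1="Gy", parent2="Gy"):
    """Count offspring genotypes over all equally likely factor pairings."""
    counts = Counter()
    for a, b in product(parent1, parent2):
        # Sort the pair so that "Gy" and "yG" count as the same genotype.
        counts["".join(sorted((a, b)))] += 1
    return counts

counts = f2_genotype_counts()
print(dict(counts))
```

The mixed genotype shows up twice because there are two routes to it, one for each parent that could have supplied the green factor.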

Starting point is 00:11:41 The reason for that is, of course, because you can get green from parent one and yellow from parent two, or yellow from parent one and green from parent two. So there's two ways to do it, whereas there's only one way to get green-green, which is to get green from both parents, and likewise for yellow. So you actually expect to see two times as many mixed offspring with one green and one yellow factor as each of the pure offspring, the green-green and the yellow-yellow. But how does that explain the three to one ratio? Because so far, we've said, well, in the F2 generation, we're going to have three different genotypes effectively. Green-green, green-yellow, and yellow-yellow. That's three different genotypes, but we only see two different phenotypes. So

Starting point is 00:12:22 what's going on there? And we see the three to one ratio. We don't see a one to two to one ratio. Well, Mendel postulated the law of dominance, which was to say that if two alleles differ, so if you've got a green and a yellow, one will dominate the other and have its phenotype fully expressed. So what Mendel said is that the green phenotype is dominant over the yellow one. The plants in the F2 generation that have green-green factors, well, obviously those will have the green phenotype. And yellow-yellow factors, well, they'll have the yellow phenotype, that's sort of obvious. But the law of dominance says, well, if you have green and yellow, if you have one factor of each, then one of those will be dominant. And it turns out in the case of this particular trait,

Starting point is 00:13:07 the green version is dominant. So the green phenotype is dominant over the yellow phenotype, which means if you have green-yellow, you'll show the green phenotype. And so that's how we get the three to one ratio, because green-greens, well, obviously those will be green, and green-yellows, of which there are twice as many, will also be green. So so far I've got three greens. And then the last possible genotype, which is yellow-yellow, well, those will be yellow. So we get three green phenotypes for every yellow phenotype. We can represent this relationship using a Punnett square diagram, which is just a graphical representation of what I've been explaining, that you inherit two factors, one from each of your parents, and that there's four ways,

Starting point is 00:13:48 four different genotypes that can result from that. In this case, it's going to be green-green, green-yellow, yellow-green, or yellow-yellow, and then when you postulate that one of those traits will be dominant over the other, in this case it's the green, then you expect to see a three-to-one ratio of phenotypes because green-green, green-yellow, and yellow-green all give green phenotypes, and then the only genotype that gives the yellow phenotype is the yellow-yellow genotype. So a Punnett square just sort of summarizes that, and you would have probably seen this in high school at some point because it's widely taught. Now, I've been talking about factors so far, again, because that's the language that Mendel used, and that's what he understood. He didn't know about

Starting point is 00:14:27 genes. We now know that the hereditary material happens to be DNA, but theoretically it could have been something different, right? Now, we also know that, at least in humans and many other organisms, each individual has two copies of each gene, one from each parent. And so that is the reason why Mendel determined that there are two factors in each organism: there's one copy inherited from each parent. We know that happens during meiosis, where the cell splits and divides up its genetic material equally between its daughter cells. I've talked about that in previous episodes, so I'm not going to go through that at the molecular level. Another thing that we know is that the law of segregation, the random distribution of the factors from parents to children, occurs because of the random distribution of chromosomes during meiosis. So when germ cells divide to produce gametes in preparation for reproduction, their chromosomes are randomly sorted and split up between the daughter cells.

Starting point is 00:15:25 So the chromosomes are randomly distributed, which explains the random assortment of traits. We also know that what Mendel talked about as forms of a factor, we call alleles, as I said before, and alleles differ from each other due to differences in the sequence of nucleotides in the DNA. But again, you don't really need to know about that to understand Mendel's findings, that's just what's happening at the molecular level here. Now, one thing I haven't explained is the law of dominance. So why is it that when you have two different alleles, so green versus yellow, one will dominate the other so that only one of those phenotypes is expressed? What's the

Starting point is 00:16:01 molecular basis of that? Well, the answer is that there isn't really a single molecular basis of that. In fact, very often, we don't see complete dominance. We see other types of relationships like incomplete dominance and codominance and epistasis, right? And I'll talk about those a little bit later in this episode. So the point is that this is why I said before that Mendel got lucky, right? He chose traits that were binary, so there's only two possible phenotypes for each trait. But he also happened to choose traits which all exhibited complete dominance. When you have two different alleles, one of those dominates the other. For many traits, that's not the case.

Starting point is 00:16:37 You don't actually see this. And it depends on the trait in question and what genes are encoding it. So Mendel just got lucky here because it meant that his results were more predictable and easier to understand than if he had chosen traits which didn't exhibit complete dominance. So to summarize at the outset here, Mendel discovered the simple relationships between the traits of offspring compared to the traits of their parents. These traits emerge because of the inheritance of two copies of each gene from parents and the independent segregation of chromosomes during the process of meiosis. And also, in the case

Starting point is 00:17:15 of Mendel's initial experiments, at least, because these traits exhibit complete dominance, where when you have a mix of two alleles, like one copy of each allele, then the trait of one of those alleles will dominate that of the other, so you'll only see the phenotype of the dominant trait. I should also say here that I talk about having two different alleles. That's called heterozygous. It's a sort of fancy word, hetero meaning different, right? So you have different alleles for a single locus or a single factor, a single gene in the genome. Whereas homozygous means you have two that are the same. So in the F2 generation, there are two homozygotes and two heterozygotes. So two that will have the same alleles, so that's your green-green and your yellow-yellow, and then there'll be two heterozygotes, so green-yellow and yellow-green.
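To make the Punnett square concrete, here is a small Python sketch that builds the square for a cross between two F1 plants and labels each cell with its phenotype and zygosity. The "G"/"y" letters, the function names, and the hard-coded assumption of complete dominance of green are illustrative choices for this example only.

```python
def punnett_square(parent1, parent2):
    """Grid of offspring genotypes: rows follow parent1's factors."""
    return [[a + b for b in parent2] for a in parent1]

def phenotype(genotype, dominant="G"):
    # Assumption: complete dominance, so one dominant factor is enough.
    return "green" if dominant in genotype else "yellow"

def zygosity(genotype):
    return "homozygous" if genotype[0] == genotype[1] else "heterozygous"

square = punnett_square("Gy", "Gy")
for row in square:
    print("  ".join(f"{g} ({phenotype(g)}, {zygosity(g)})" for g in row))

cells = [g for row in square for g in row]
green = sum(phenotype(g) == "green" for g in cells)
yellow = len(cells) - green
print(f"green : yellow = {green} : {yellow}")
```

Swapping in different parent strings, for example "GG" and "yy" for the parental cross, shows why every F1 plant comes out heterozygous and green.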

Starting point is 00:18:04 And again, the Punnett square is very useful for that. The Punnett square shows the relationship between the phenotypes and the genotypes with the two homozygotes and the two heterozygotes. This theory is complicated a little bit in the case of what are called sex-linked traits. Humans have 22 pairs of homologous autosomes. That's where most of our genetic material is located, but we also have two sex chromosomes. If you're male, you'll have an X and a Y chromosome typically, and if you're female, usually two X chromosomes.

Starting point is 00:18:34 The sex chromosomes are different in a number of ways, and one is that in males they're not homologous, right? You have two different chromosomes. In females, they are homologous because you have two X chromosomes, though there are some other differences there as well, which I'll mention in a moment. Now, because of this difference between males and females when it comes to these chromosomes,

Starting point is 00:18:52 inheritance of traits that are encoded on either the X or the Y chromosome behaves differently to inheritance of traits that are encoded on all the other chromosomes. Because for all of the regular chromosomes, the autosomes, as they're called, you can inherit, or you will inherit, one copy from each parent. But for sex chromosomes, that's not the case. It depends on the sex of the parent and also of the child. So, for example, a son always inherits a Y chromosome from his father. He can't inherit a Y chromosome from his mother because his mother doesn't have one,

Starting point is 00:19:26 whereas a daughter always inherits her father's X chromosome. Her father only has one X chromosome, and she will always inherit that. So, in a sense, the probabilities are different when you're talking about X chromosomes, given the sex of the child and the parent. Now, this gives rise to different patterns of dominance and recessive traits. In a phenomenon called X-linked dominance, a dominant trait is carried on the X-chromosome. In this case, males and females are affected at the same rate because males and females both have one X chromosome at least. And in the case of a dominant trait, you only need a single copy of the corresponding allele to exhibit the trait.

Starting point is 00:20:08 So if the trait is dominant, then if you get that version of the gene, then you'll have the trait. If you don't get at least one copy, then you won't have that trait, right? So in this case, males and females are affected at the same rate. Where it gets interesting is in X-linked recessive traits. So this is where a recessive trait is carried on the X chromosome. Now remember for recessive traits, if you're heterozygous, so that means you have one copy of the variant allele and one copy of the wild type or the regular allele,

Starting point is 00:20:40 you won't exhibit the trait because it's recessive, right? It means that the other version dominates. You only exhibit a recessive trait if you're homozygous for that recessive trait. So that was the yellow-yellow case in the pea example we just talked about. So you need to have both of the corresponding alleles for the trait, to be homozygous for that trait, in order to exhibit the recessive phenotype,

Starting point is 00:21:07 the corresponding recessive phenotype. Now, if there's a recessive trait carried on the X chromosome, this means that females can be carriers if they're heterozygous for the trait, because they can have one X copy that has an allele for this trait and one that doesn't. But males can't be carriers because they can only have one copy of the X chromosome. So that means if males have the variant allele, they'll have the trait. If they don't have the variant allele, then they won't have the trait.

Starting point is 00:21:34 So males can't be carriers because they can't be heterozygous. This means that X-linked recessive traits are much more common in males compared to females. So just to be clear about that, the reason for the difference is because effectively, in order for a daughter to have the trait, she must inherit the affected allele from both parents. Whereas for a son to have the X-linked recessive trait, he only needs to inherit the affected allele from his mother. He can't inherit it from his father, because remember, sons always inherit the Y chromosome from their father.
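The asymmetry between sons and daughters can be checked by enumerating a single cross. The sketch below is a minimal illustration in Python; the encoding ("X" for an X chromosome carrying the usual allele, "x" for one carrying the recessive variant, "Y" for the Y chromosome) and the function name are assumptions for this example.

```python
from itertools import product

def x_linked_cross(mother, father):
    """Enumerate the equally likely children for an X-linked recessive trait.

    Each parent is a tuple of sex chromosomes: "X" carries the usual
    allele, "x" carries the recessive variant, "Y" is the Y chromosome.
    """
    children = []
    for from_mother, from_father in product(mother, father):
        sex = "son" if from_father == "Y" else "daughter"
        xs = [c for c in (from_mother, from_father) if c != "Y"]
        if all(c == "x" for c in xs):
            status = "affected"      # every X present carries the variant
        elif "x" in xs:
            status = "carrier"       # only possible with two X chromosomes
        else:
            status = "unaffected"
        children.append((sex, status))
    return children

# Carrier mother and unaffected father.
for child in x_linked_cross(("X", "x"), ("X", "Y")):
    print(child)
```

With a carrier mother and an unaffected father, only sons can be affected in this enumeration; a daughter is affected only when the father also carries the variant, for example `x_linked_cross(("X", "x"), ("x", "Y"))`.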

Starting point is 00:22:14 Whereas the daughter, again, needs to inherit an affected allele from both the mother and the father, and so that's less likely. So women are much less likely to be affected by X-linked recessive traits. Examples of common X-linked recessive traits include color blindness, muscular dystrophy, and hemophilia. Color blindness is probably the single most common one. Red-green color blindness is found in about 10% of men and only about 1% of women. There are some Y-linked traits, so these are traits that are carried on the Y chromosome.

Starting point is 00:22:45 However, they're much less common than X-linked traits, because the Y chromosome is much, much smaller than the X chromosome and contains comparatively very few genes, and most of the genes that it does contain relate to male reproduction. So there aren't really very many Y-linked traits. By comparison, the X chromosome has lots of genetic material that has nothing to do with reproduction. Again, that's because males and females each have an X chromosome, so it can carry genetic material that's relevant for both sexes, whereas for the Y chromosome, only men have it, so it can't contain genes that females need as well, right? That wouldn't really make any sense.

Starting point is 00:23:20 So evolutionarily, it's been selected to only have genes that are relevant for male reproduction. Now, I should say that there's a bit of a complication here because although females always have two copies of the X chromosome, not just one copy, it turns out that only one of those copies is ever actually active or utilized, like actively transcribing genes in any given cell. This is because of something called X chromosome inactivation. So in any given cell in a female's body, one of the X chromosomes will be active and being used to transcribe genes and all that. And the other one will be inactivated. It's kind of tightly wound up and unable to be utilized for anything. The reason for this is because otherwise females would have excessive production of all of the X chromosome genes compared to males.

Starting point is 00:24:20 because they would have twice as many, right? You'd transcribe twice as many genes, you'd have twice as many gene products, and that would result in various problems. So to balance that out, so that there's the same amount of production of genes, or roughly the same amount as in males, one of the X chromosomes is inactivated.

Starting point is 00:24:35 However, which X chromosome is inactivated is not the same in every cell in the body. So certain lines of cells will, at some point in the development process, have one of the X chromosomes inactivated, and others will have the other X chromosome inactivated. It's random. So what this means is that if a female is heterozygous, so she inherits two different alleles for the same gene, some of the cells in her body will express one of those alleles and the other cells will express the other allele.

Starting point is 00:25:05 So overall, she will be heterozygous. It's just that any given cell effectively only has one active allele of that gene. But different cells in the body will have one versus the other of those alleles active. Now this gives rise to a phenomenon which has only relatively recently been appreciated, called skewed X inactivation. What this means is that although it's true that in different cells throughout the body, one or the other of the X chromosomes will be inactivated, it's not always an even split. Sometimes the X chromosome containing one allele will be inactivated more often than the other. And particularly this can occur in certain cell lines, essentially because the inactivation occurs at various points in the developmental process. So once a cell has been

Starting point is 00:25:51 inactivated, all of its descendants will also have that same X chromosome inactivated, or at least typically will. So if by chance a certain X chromosome in a certain cell at a key point in development becomes inactivated, then all of the daughter cells of that will also have the same X chromosome inactivated and therefore will exhibit the phenotype corresponding to whichever of the alleles on the X chromosome that particular cell has. Now this can be important because certain genes may only be expressed by certain cell types. So if there's a trait that, let's say, affects digestion, only in certain cells in your body,

Starting point is 00:26:27 will that particular gene actually be expressed and therefore the corresponding phenotype manifested. So it may not matter that you're heterozygous for a trait if the only cells that actually exhibit that trait, that actually utilize the corresponding genes, have one or the other of the possible alleles active. So you may only actually see one of them in action because the other allele that you have is sort of buried in all the inactive X chromosomes in different cells in your body. And this is what's called skewed X inactivation. It's that, yeah, overall, it's random which X chromosome is inactivated. But if you have inactivation in certain cell lines relative to others, then one trait versus the other may be much more likely

Starting point is 00:27:12 to be expressed or much more commonly expressed in a particular individual. So because of this, scientists have disputed whether it even really makes sense to talk about X-linked recessive traits. They say we should just talk about X-linked traits, because most X-linked traits are not purely recessive or dominant because of this skewed inactivation. It's sort of somewhere in the middle, depending on exactly which cells have the inactivated versus the active version of the X chromosome. So that's something to bear in mind when understanding X-linked and particularly X-linked recessive traits.
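One way to see how a random choice can still give a skewed outcome is a toy simulation. The Python sketch below is a deliberately simplified model, not a description of real developmental biology: it assumes a tissue descends from a small number of founder cells, that each founder inactivates one X at random with equal probability, that all descendants keep the founder's choice, and that every founder contributes equally. The function name, the founder counts, and the 75:25 threshold for calling a tissue "skewed" are all assumptions chosen for illustration.

```python
import random

def active_maternal_fraction(n_founder_cells, rng):
    """Fraction of a tissue in which the maternal X stays active.

    Assumption: each founder cell picks which X to inactivate at random
    (50/50) and passes that choice on to all of its descendants.
    """
    maternal_active = sum(rng.random() < 0.5 for _ in range(n_founder_cells))
    return maternal_active / n_founder_cells

rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (4, 16, 256):
    fractions = [active_maternal_fraction(n, rng) for _ in range(1000)]
    # Call a tissue "skewed" if one X is active in at least 75% of it.
    skewed = sum(f <= 0.25 or f >= 0.75 for f in fractions) / len(fractions)
    print(f"{n:>3} founder cells: {skewed:.0%} of simulated tissues are skewed")
```

Under these assumptions, the fewer founder cells a tissue has at the time of inactivation, the more often chance alone leaves it dominated by one X chromosome.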

Starting point is 00:27:46 But now we're going to move on from the sort of simple Mendelian inheritance of single binary traits which we've been talking about so far. And we're going to start talking about more complex, realistic traits, because it's true that there are some Mendelian traits observed in humans where there's only two versions, two phenotypes corresponding to each trait, and the trait is determined only by a single gene, which shows complete dominance, so that one of the alleles shows complete dominance over the other, and where the alleles segregate independently. But only a few traits in humans, and in other organisms as well, actually exhibit all of these classical Mendelian properties, right? In most cases, there are other things happening. It's much more complicated. Some examples of purely Mendelian traits, or close to purely Mendelian traits, in humans include albinism, color blindness, Huntington's disease, and sickle cell

Starting point is 00:28:40 disease, because these all effectively are determined by a single mutation on a single gene, which is either dominant or recessive relative to the wild type trait. There are many reasons why Mendelian inheritance doesn't apply to most traits, and I've already highlighted some of them. What we're going to do now through the rest of the episode is go through and progressively introduce more complications, or relax some of the assumptions that Mendel implicitly made when he did his initial analysis on pea plants, right? The first generalisation that we're going to make is to consider multiple independent traits. So so far we've only considered a single trait at a time, a single trait determined by

Starting point is 00:29:18 a single gene. Now we're going to examine multiple traits encoded by different genes, and in particular two traits at a time. We're going to talk about the law of independent assortment and chi-square tests. So what happens if we consider the variation of two traits at once instead of just considering one trait? Again, at this point, we're still going to assume that each trait only has two phenotypes and that each trait is determined by a single gene. So this is still making the same assumptions as before. It's just that now we're considering two traits and two corresponding genes at once instead of just one at a time. So in the example here, we can consider green and yellow pods and wrinkled versus smooth seeds. So those are the two

Starting point is 00:29:58 traits we're going to look at. Again, each is a binary trait, so there are only two possibilities here. Now the way you investigate this is to produce a slightly different cross to what we talked about before. So remember before we talked about the parental generation, the F1 and the F2 generation. Now we're going to introduce something called a dihybrid cross. This is a very important type of mating that's used in genetic studies where you mate two dihybrids together. So how does that work? Well, it's similar to what we talked about before, but it's a bit more complex just because of the two traits instead of one. So what you start with is your parentals.

Starting point is 00:30:35 Now, in this case, the parentals will be homozygous for two traits instead of just one trait. So this is the difference. In the original parentals for the single trait, we just had the green ones and we had the yellow ones. Then we cross them together. Now, in this case, we're going to do the same thing, but we're going to be looking at two traits at the same time instead of just one. So we're going to produce two pure breeding lines, each consisting of plants that have

Starting point is 00:30:59 one particular variant of each trait. And the way that this is done is that you have one version of the plant that will be homozygous dominant and one that will be homozygous recessive. We established before that the yellow pod is the recessive. So we're going to ensure that one of our parentals is homozygous yellow. And the other trait that's recessive, as it turns out, is wrinkled seeds compared to smooth. So smooth is dominant, wrinkled is recessive. What we're going to do is start out with one parental being homozygous yellow wrinkled. So this is how we describe it. We describe both traits.

Starting point is 00:31:34 And homozygous means that both of the factors that the parent has will be wrinkled and both will be yellow. And you do this again just by selfing, for a bunch of generations, plants that have these two traits. And the other parent will be homozygous dominant, which in this case is going to be green pod and smooth seeds. Okay, so we've got our parentals. So they're homozygous, but with different traits. Then the F1 generation will be produced, as before, from crossing those together. So we cross the smooth seed green pods with the wrinkled seed yellow pods, and we get the F1 generation. The F1 generation will all be dihybrid.

Starting point is 00:32:13 So this is where the notion of dihybrid comes in. The reason is because it will inherit one dominant factor of both traits from one parent and one recessive factor of both traits from the other parent. So all of the individuals in F1 will be genetically identical and they'll all be dihybrids. So they'll be heterozygous for each trait. As before, they will show the same phenotype, the phenotype of the dominant parent. So in this case, again, that's going to be smooth seeds, green pods. Now the really interesting part is when we look at the F2 generation.

Starting point is 00:32:46 And this is where the dihybrid cross comes in because we have our dihybrids, right? These are our F1 generation, heterozygous for each of the traits. We cross them together. In the single trait case, this resulted in a three to one ratio of phenotypes, right? In the dihybrid case, because we're looking at two traits, the ratio is going to be different because obviously there's more going on here. In this case, what Mendel found was a ratio which very closely approximated 9 to 3 to 3 to 1. And this is what you would expect from independent assortment of the two traits. So if the alleles independently assorted from parents to offspring, they're separately and independently

Starting point is 00:33:24 separated and passed on to offspring, you would expect to see this 9 to 3 to 3 to 1 ratio. So let me explain the significance of these numbers here. So where does 9 to 3 to 3 to 1 come from? To understand that, we just think about our Punnett square diagram. In this case, it's going to be a bit more complicated because there are two traits. So the way that we can represent it is as a 4 by 4 matrix. Why 4 by 4?

Starting point is 00:33:50 Well, because each of the two parents can pass on four different combinations of alleles. So that's going to be dominant of each, recessive of each, dominant of one and recessive of the other, or vice versa, recessive of one and dominant of the other. So there's four different possibilities that each parent can pass on. And so that gives you four by four. That's 16 total possibilities. So that's where this nine to three to three to one ratio comes from, because if you add up those numbers you get 16. So the ratio is just telling us how many different combinations of genotypes there are for each phenotype. There are four phenotypes. You could have both dominant traits, both recessive traits, or one recessive trait and one dominant trait, which can happen in two ways. So in this case,

Starting point is 00:34:38 that's going to be the two dominant traits. So this is green pods and smooth seeds. These are both the dominant traits, and this is also the combination of traits that's seen, or the phenotype that's seen in the F1 generation. Or you could have both of the recessive traits. So this is going to be wrinkled seeds and yellow pods. Or you could have one of each. So you could have wrinkled seeds and green pods or you could have smooth seeds and yellow pods. So any combination is possible. However, they're not all equally likely. There are going to be nine times as many of the double dominant phenotype compared to the double recessive phenotype. So this is what Mendel observed. He observed nine times as many green pod,

Starting point is 00:35:20 smooth seed offspring in the F2 generation as compared to the yellow pod wrinkled seed. And the reason for that is just because if you count the combinations, there are nine times as many ways that you can inherit at least one dominant allele of each of the genes from parents as there are of just getting the recessive of each. And then the two threes mean that for every single plant that has wrinkled seeds with yellow pods, there'll be three plants that have wrinkled seeds with green pods, and also there'll be three plants, again, that have yellow pods with smooth seeds. And then again, for every one of these double recessive, wrinkled and yellow, there'll be

Starting point is 00:36:08 nine plants that have both of the dominant phenotypes, which is smooth and green. So that's the 9 to 3 to 3 to 1 ratio, and that comes just from independent inheritance of the two genes from each of the parents. This is called the law of independent assortment, that the different factors for different traits are inherited independently from one another. As we discussed before, the molecular basis of this is the independent sort of shuffling and distribution of the chromosomes during meiosis when the gametes are produced. Now, astute listeners may have realized that independent assortment only makes

Starting point is 00:36:50 sense if the factors, or the genes, that code for the traits are on different chromosomes. Obviously, if two traits are coded for by genes on the same chromosome, then they can't be inherited separately from each other, right, because they're on the same chromosome. Well, it gets a bit more complicated because of crossing over. Now, I've talked about crossing over in previous episodes. That's effectively where bits of the chromosome sort of swap over. So you get sort of mix and matches, where you don't actually inherit a single chromosome intact from a parent. The chromosome you get from, say, your father is actually made of bits of his two copies of that chromosome, which are swapped over and

Starting point is 00:37:24 kind of pasted back together again. So it turns out that independent assortment is valid as long as the two genes are either on separate chromosomes, in which case they'll be inherited independently, or far away from each other on the same chromosome, because then crossing over will likely separate them as well, or, you know, will have about the same chance of separating them as if they were on different chromosomes. However, if two traits are coded for by genes that are close together on the same chromosome, then you won't observe independent assortment. Then they tend to be inherited together because they're close to each other and are likely to be passed along together. We'll talk more about that in a future episode when we talk about genetic mapping,

Starting point is 00:38:02 because this issue of how often the traits are inherited together being related to how close they are on a chromosome is very useful for genetic mapping. But for the rest of the episode, I'm not going to talk too much about that. And mostly I'm going to assume independent assortment of traits, meaning that the genes for the traits are located on separate chromosomes or far apart on the same chromosome. And this is true for most traits, right, because in humans, for example, there are 22 pairs of autosomes. So if you pick two random traits, likely they're going to be on separate chromosomes. Even if they are on the same chromosome, they're more likely to be far apart than close together. So most traits assort relatively independently, but you should be aware that, again, Mendel got

Starting point is 00:38:38 lucky that he didn't happen to find any pairs of traits that were located close together on the same chromosome because they wouldn't have obeyed this law of independent assortment. And so for those, you will see different ratios than the 9 to 3 to 3 to 1. But henceforth in this episode, we're going to ignore that complexity. We'll come back to it in a future episode. Now, in order to test whether two traits are inherited independently from each other, or also whether they exhibit this 9 to 3 to 3 to 1 ratio more generally, because there's other reasons why that ratio can fail, statistical methods are used, which effectively involve counting

Starting point is 00:39:13 well, how many offspring do we expect to be in each of these four different phenotypic categories? So how many are going to be double dominant, how many are going to be one dominant, one recessive, for each of the two ways of that, and then how many are going to be double recessive. You calculate how many are expected given independence, and then you count how many are actually observed in each of those classes, and then you conduct a statistical test called a chi-square test, and you get a number, which indicates sort of how likely it is that you would observe a pattern of traits at least that far from expectation, given that the inheritance was actually independent. And this will tell you effectively whether the data are consistent with the traits being inherited independently

Starting point is 00:39:52 or whether there's some deviation from the 9 to 3 to 3 to 1 number. And again, as I indicated, deviation from this 9 to 3 to 3 to 1 can indicate that the traits are not assorted independently. That is that they're located nearby on the same chromosome. Or it can indicate that we have something other than the simple complete dominance that we talked about. So remember, that's another one of the assumptions that we need in order to derive the 9 to 3 to 3 to 1 ratio. We need to assume that there's two different forms and only two forms that are relevant to determining the traits, so two alleles. We need to assume that each trait is determined only by a single gene locus. We need to assume that those

Starting point is 00:40:33 gene loci are located on separate chromosomes or far away on the same chromosome, which gives independent assortment. And we need to assume that when you have two different alleles, so when you're heterozygous, exactly one of the corresponding traits will be manifested, so that is the dominant trait. So these four assumptions are quite rigid, and as I said earlier, only in rare cases are all of these assumptions actually satisfied. So now we're going to move on and talk about modification of Mendelian ratios. So this is when the numbers differ from the expected 3 to 1 in the single gene case, or more often we look at the two gene case, so the 9 to 3 to 3 to 1 ratio. So what happens when we get deviation from that?
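
As a rough sketch of the counting described above, the following Python snippet fills in the 4 by 4 Punnett square for a dihybrid cross, recovers the 9 to 3 to 3 to 1 expectation, and then computes a chi-square statistic against it. The allele letters (G, g, S, s), the function names, and the observed counts are all assumptions made up for illustration; they are not from the episode or from Mendel's data.

```python
from collections import Counter
from itertools import product

# Allele letters are labels chosen for this sketch:
# G = green pod, g = yellow pod, S = smooth seed, s = wrinkled seed.
# A dihybrid (GgSs) can pass on four gametes: GS, Gs, gS, gs.
GAMETES = [colour + shape for colour, shape in product("Gg", "Ss")]


def phenotype(gamete_a, gamete_b):
    """Phenotype of one offspring, assuming complete dominance at both genes."""
    colour = "green" if "G" in (gamete_a[0], gamete_b[0]) else "yellow"
    shape = "smooth" if "S" in (gamete_a[1], gamete_b[1]) else "wrinkled"
    return colour, shape


def chi_square(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over the classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Count phenotypes over the 16 cells of the Punnett square.
counts = Counter(phenotype(a, b) for a, b in product(GAMETES, repeat=2))
classes = [("green", "smooth"), ("green", "wrinkled"),
           ("yellow", "smooth"), ("yellow", "wrinkled")]
ratio = [counts[c] for c in classes]

# Hypothetical F2 counts, invented for illustration only.
observed = [90, 27, 33, 10]
total = sum(observed)
expected = [total * r / 16 for r in ratio]
statistic = chi_square(observed, expected)

# Four classes give 3 degrees of freedom; 7.815 is the standard 5% critical value.
CRITICAL_5_PERCENT_DF3 = 7.815
consistent = statistic < CRITICAL_5_PERCENT_DF3

print(ratio, round(statistic, 2), consistent)
```

A statistic below the critical value means the counts give no reason to reject the 9 to 3 to 3 to 1 expectation. It does not prove that the genes assort independently.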

Starting point is 00:41:16 So I've just explained one source of deviation is when the traits don't segregate independently, and that occurs because of traits that are located close together on the same chromosome. We won't talk about that further here because we'll discuss that in a future episode. But there's still three other reasons why we can have deviation from these Mendelian ratios. The first one we'll talk about is when we have violations of the assumption of complete dominance. So complete dominance is the classic case where when you're heterozygous, so you have one allele each of the two different versions, only one corresponding trait is observed. So in the pod case, we had either green pods or yellow pods.

Starting point is 00:41:55 Those were the only two possibilities. When you were heterozygous, you had a green pod, right? Same with the wrinkled versus smooth peas. There are many cases where we don't see complete dominance. We see something else happen. So you can have what's called incomplete dominance. This occurs when one allele of a gene is not completely dominant, so you also see some of the phenotype coming through from the other

Starting point is 00:42:15 allele as well. So in this case, in the case of incomplete dominance, you can actually distinguish the phenotype of heterozygotes from homozygotes. And so in this case, the phenotypic ratio will actually be equal to the genotypic ratio. So instead of seeing three to one, which is what Mendel saw with the green versus yellow pods, in this case you'll actually see one to two to one. You'll actually distinguish the heterozygotes from the homozygotes. Pigmentation is a common case of incomplete dominance because you'll often see intermediate cases when you're heterozygous. So the color of fur in animals, for example. You may have a very dark color or a very intense coloration if you have double dominant,

Starting point is 00:42:56 if you have double recessive, you may have a very light or no pigmentation, and you may have an intermediate pigmentation if you are heterozygous. So in that case, there's incomplete dominance. You actually see the difference between intermediate cases when you're heterozygous and then the full dominant case. So that's quite common for many types of genes. Another case is what's called codominance. The difference between codominance and incomplete dominance is that in incomplete dominance, what you see in heterozygotes is an intermediate trait. You don't see either the dominant or the recessive trait. You see something different, but it's sort of intermediate. You can place it on a scale, like a spectrum, and it fits in the middle somewhere. In the case of codominance, a heterozygote will

Starting point is 00:43:38 actually exhibit both phenotypes. So they'll exhibit the phenotype of each of the two alleles. One example would be blood types. So in blood types of humans, there are actually three different alleles. So there's the allele that gives you the A antigen on your blood cells. There's the allele that gives you the B antigen. And then you can also have an allele that doesn't give you any of those. So three alleles, essentially A, B or neither. So this is actually a case where we see violation of the assumption that there's only two alleles. Codominance can occur when we actually have three alleles. Now, in the case of blood types, just to explain how that works, if you are homozygous A, that means you will only have A

Starting point is 00:44:22 antigens on your red blood cells, and therefore you will only exhibit that phenotype of, well, having the A antigens, right? Likewise, if you're homozygous B, you will exhibit the B trait of having the B antigen. If you only have the allele corresponding to neither of these antigens, this is called O type, right? You're type O, you don't have either of the antigens. But if you're heterozygous, if you have one A and one B, you'll actually exhibit both traits. So you'll have the A and the B antigens. So this is where the codominance comes in.
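
The blood type cases above can be written down as a small lookup. This is a minimal sketch with a made-up function name, `blood_type`. It also covers the A-with-O and B-with-O genotypes, which aren't spelled out here; those follow the standard rule that the O allele adds no antigen, so such a person is simply type A or type B.

```python
from itertools import combinations_with_replacement


def blood_type(allele_1, allele_2):
    """Return the ABO blood type for a pair of alleles ('A', 'B' or 'O')."""
    # Each A or B allele contributes its antigen; the O allele contributes none.
    antigens = {allele for allele in (allele_1, allele_2) if allele in ("A", "B")}
    if antigens == {"A", "B"}:
        return "AB"  # codominance: both antigens are present
    if antigens:
        return antigens.pop()  # only A or only B antigen
    return "O"  # neither antigen


# List all six genotypes that three alleles can form.
for pair in combinations_with_replacement("ABO", 2):
    print("".join(pair), "->", blood_type(*pair))
```

Three alleles give six genotypes but only four phenotypes, and the AB genotype is the one that shows codominance.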

Starting point is 00:44:56 Codominance is observed in individuals who are heterozygous, who have one A allele and one B allele. They actually exhibit both antigens, the A and the B antigen. So we see the difference between incomplete dominance and codominance. In incomplete dominance, the heterozygotes have an intermediate phenotype, whereas in codominance, heterozygotes actually have both phenotypes, so A and B in this case. And codominance can arise when there are multiple alleles instead of just the regular two. Another reason why we can have variation or modification of Mendelian ratios is because of what

Starting point is 00:45:32 are called recessive lethal alleles. So sometimes instead of observing the usual 3 to 1 or 1 to 2 to 1 phenotypic ratios, you'll actually see a 2 to 1 ratio. And typically there, the reason we would see that is because all of the recessive homozygote combinations are screened off. So we don't observe them. We don't observe any offspring with that combination because that combination is actually lethal. And of course, some traits are not 100% lethal, but they can reduce the survival rates.
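
Here is a minimal sketch of how survival changes the ratio, under a stated assumption: the allele is lethal only when homozygous, but carriers are visibly different from non-carriers, and any homozygotes that do survive are counted with the carriers. The function name and the `viability` parameter are made up for illustration.

```python
from fractions import Fraction


def surviving_ratio(viability):
    """Carriers per non-carrier among survivors of a heterozygote cross.

    Before selection the offspring are 1 : 2 : 1 (homozygous for the lethal
    allele : heterozygous : homozygous normal). `viability` is the fraction
    of lethal-allele homozygotes that survive. Assumption of this sketch:
    surviving homozygotes look like the heterozygotes.
    """
    viability = Fraction(viability)
    carriers = 2 + viability      # 2 heterozygotes plus surviving homozygotes
    non_carriers = Fraction(1)    # the single homozygous normal class
    return carriers / non_carriers


print(surviving_ratio(0))               # fully lethal
print(surviving_ratio(Fraction(1, 2)))  # half of the homozygotes survive
print(surviving_ratio(1))               # no lethality at all
```

With no survivors the ratio is 2 to 1, with full survival it goes back to 3 to 1, and partial survival lands somewhere in between.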

Starting point is 00:45:59 And so those sublethal mutations will also alter the phenotypic ratios, but not to the full extent. So you'll see ratios that are in between 3 to 1 and 2 to 1, so maybe two and a half to 1, which is an indication that some of the homozygote recessive offspring are surviving, but also some of them aren't, and therefore the ratio is affected there. Okay, so far we've seen how violations of two of our assumptions can lead to variations of classical Mendelian ratios. So if we violate the assumption of complete dominance, we can have results like incomplete dominance. There's other possibilities as well, but that's a simple example to understand.

Starting point is 00:46:37 If we violate the assumption of only two alleles, we can see codominance, where heterozygotes exhibit traits of both versions of the allele. Next, we're going to talk about violation of the assumption that a single trait is determined by a single gene. This is called gene interaction, where we actually have interactions between different genes, and so you can't just look at them separately, but you have to look at their combination and how they interact with each other. One of the common manifestations, not the only one, but one of the common manifestations of gene interactions is called epistasis. I mentioned this in the introduction. Epistasis occurs when genes are not independent from each other, but one of the genes, or an allele of one of the

Starting point is 00:47:17 genes, screens off the phenotypes from one of the other ones. We can understand this as a kind of masking effect. So what we mean by screening off or masking a trait is effectively this. If you considered two different traits each by themselves, you might see two phenotypes corresponding to one trait and then two phenotypes corresponding to the other trait. But if you combine them together and look at their inheritance in combination, you might only see three different phenotypes instead of the original four different phenotypes, because one of the phenotypes has been screened off by one or more of the others. So remember when we talked about the law of independent assortment and the yellow versus green pods and the smooth versus

Starting point is 00:47:56 wrinkled peas. So that gives rise to four different phenotypic possibilities. You know, smooth yellow, smooth green, wrinkled yellow, wrinkled green, right? What's interesting in epistasis is that when you consider, you know, two or even more than two genes together, there's an interaction between them. So some of those phenotypes are not actually observed. So instead of seeing four different phenotypes in some ratio, you will only see, say, three different phenotypes or even two different phenotypes. So this is a variation of the 9 to 3 to 3 to 1 ratio we'd expect to see when two genes are independent from each other. When two genes interact with each other, we see violations of the 9 to 3 to 3 to 1 ratio.

Starting point is 00:48:35 So let's talk about some different violations or variations from that ratio and why they can occur. Epistasis is a very complicated phenomenon, so I'll just give a very brief introduction here. So in recessive epistasis, the 9 to 3 to 3 to 1 ratio becomes a 9 to 3 to 4 ratio. And what's happened here is that one of the one-dominant, one-recessive combinations, so one of the threes, has joined together with the one, so the double homozygote recessive. They've joined together and turned into a four, right? So nine to three to four, that's just three plus one together. What this means is that the homozygote recessive is no longer distinguishable from one of the one-dominant, one-recessive classes.

Starting point is 00:49:13 They have the same phenotype. How can that happen? Well, one way that you can observe a nine to three to four ratio is if the genes code for proteins operating in the same pathway. So suppose we have some metabolic pathway that utilizes proteins, say that are enzymes or something, one of which operates after the other in the pathway. Now, if both genes have a functional allele, we'll have the wild type phenotype, so that's the nine in the ratio, right? If the first protein in the pathway can function normally but the second cannot, you'll get a new phenotype. But if the first protein cannot function, the double recessive and the second

Starting point is 00:49:48 one-dominant, one-recessive class will be indistinguishable. So let's sort of reason that through to make sure we understand. We have two proteins in a pathway. Let's think about the homozygote recessive case. The homozygote recessive case is that neither of these proteins works correctly. In that case what we'll see is some phenotype corresponding to when neither of those works. Now let's suppose we consider the case where the second protein works, but the first one doesn't. Now you see the point here. Because there's an ordering to them, it doesn't really matter if the second protein works if the first one doesn't work, because you never get to that. And so in that case, you'll see the same phenotype as you did in the case of the homozygote

Starting point is 00:50:30 recessive, whatever phenotype corresponds, maybe some severe deficit, right? Whatever phenotype corresponds to having neither of those proteins working. Because it doesn't matter if you have the second protein working if you never get to that point in the pathway because the first protein isn't working. So this is how we can have a case where one of the one-dominant, one-recessive classes gives the same phenotype as the double homozygous recessive. So now there's only three phenotypes. There's the wild type phenotype corresponding to both proteins working. There's one phenotype corresponding to the first protein works, but the second one doesn't. Maybe that's a mild impairment. And then there's the third phenotype, which is what you get when either neither protein works or the second one works, but the first one doesn't.
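
The two-step pathway reasoning above can be checked by brute force. This is only a sketch of the hypothetical pathway: the gene names A and B, the phenotype labels, and the function name are assumptions for illustration, with an uppercase letter standing for a functional allele.

```python
from collections import Counter
from itertools import product

# Hypothetical pathway: the protein from gene A acts first, the protein from
# gene B acts second. Uppercase = functional allele, lowercase = non-functional.
GAMETES = [a + b for a, b in product("Aa", "Bb")]  # AB, Ab, aB, ab


def pathway_phenotype(gamete_1, gamete_2):
    """Phenotype of an offspring under the ordered two-step pathway."""
    step_1_works = "A" in (gamete_1[0], gamete_2[0])
    step_2_works = "B" in (gamete_1[1], gamete_2[1])
    if not step_1_works:
        return "severe"  # blocked at step 1, so gene B no longer matters
    if not step_2_works:
        return "mild"    # step 1 completes, step 2 fails
    return "wild type"


# Cross two double heterozygotes and tally the 16 cells of the Punnett square.
counts = Counter(pathway_phenotype(g1, g2)
                 for g1, g2 in product(GAMETES, repeat=2))
print(counts["wild type"], counts["mild"], counts["severe"])
```

The tally comes out as 9 wild type, 3 mild and 4 severe, which is the 9 to 3 to 4 ratio.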

Starting point is 00:51:11 And so either case you get the same result, which is the severe impairment. So that's one example of how you can have a screening off of a phenotype. You don't actually see a separate phenotype for one of these one-dominant, one-recessive classes because it's screened off by the homozygote recessive in this case. In a similar case, we can also get a 9 to 7 ratio. The reason we could have that is if both proteins were necessary for even the first step to occur in a chain. So if you need both proteins, even for the first step to occur, if you're missing either protein, then nothing will happen. So in this case, either you're wild type, which is the 9, or if you have any deviation from wild type at one gene locus or the other, it doesn't matter which one, you'll get the

Starting point is 00:51:52 other phenotype, probably a disease or variant phenotype, right? So the difference between whether you're going to see the 9 to 3 to 4 or the 9 to 7 is essentially whether the proteins, in this sort of hypothetical example, operate one after the other, or whether they operate kind of at the same time, or at the same step, at least, of the metabolic pathway. Another type of epistasis is dominant epistasis. So instead of seeing 9 to 3 to 4 or 9 to 7 ratios, in dominant epistasis, you'll see a 12 to 3 to 1 ratio. And we see what's happened here. Instead of one of the one-dominant, one-recessive phenotypes combining with the recessive,

Starting point is 00:52:26 so that's where we got the four from, or even the seven, in this case, one of the one-dominant, one-recessive phenotypes is combined with the double dominant phenotype. So that's what's called dominant epistasis. It effectively describes which phenotype has been kind of combined with one of the one-dominant, one-recessive phenotypes. So in dominant epistasis, you might have a dominant allele that masks the expression of the gene at the other locus. So if you have the dominant allele at one of the genes, then it doesn't matter what the other gene is doing. The other one only comes into play as kind of a backup if you don't have the first allele.
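
Another way to see these ratios is as regroupings of the four 9 to 3 to 3 to 1 classes. In this sketch, the class labels ("A_" for at least one dominant allele at gene A, "aa" for none) and the function name are conventions picked for illustration.

```python
# The four classes of a dihybrid cross and their shares of the 16 squares.
# "A_" means at least one dominant allele at gene A; "aa" means none.
CLASSES = {"A_B_": 9, "A_bb": 3, "aaB_": 3, "aabb": 1}


def merged_ratio(groups):
    """Add up the shares of the classes that end up with the same phenotype."""
    return [sum(CLASSES[name] for name in group) for group in groups]


# Recessive epistasis: aa masks gene B, so aaB_ looks like aabb.
recessive_epistasis = merged_ratio([["A_B_"], ["A_bb"], ["aaB_", "aabb"]])

# Both proteins needed at the same step: anything short of A_B_ looks the same.
both_needed = merged_ratio([["A_B_"], ["A_bb", "aaB_", "aabb"]])

# Dominant epistasis: a dominant A allele masks gene B, so A_bb looks like A_B_.
dominant_epistasis = merged_ratio([["A_B_", "A_bb"], ["aaB_"], ["aabb"]])

print(recessive_epistasis, both_needed, dominant_epistasis)
```

Every grouping still adds up to 16. The only thing that changes is which classes share a phenotype.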

Starting point is 00:52:59 In fact, there are many, many other types of ratios that are possible, including 13 to 3, 15 to 1, and many other combinations. Basically any way of sort of adding the 9s, 3s, and 1s together, you can potentially get those as a result of epistasis. It just depends on which gene loci are involved, how they interact with each other, whether they suppress each other, modify each other or regulate each other, and exactly what the pathway is. So that's why I said epistasis is quite complex, and I won't try to explain all of the different possibilities and how they come about, because there's many ways they can come about. The important point is epistasis involves a screening off or a masking of the phenotype corresponding to one particular genotype class, often one of the one-dominant, one-recessive classes, or it can be others as well, and that occurs

Starting point is 00:53:42 because of interactions between the genes and how they manifest the phenotypes. Often you can think about this as because the genes operate in a pathway where they support each other or suppress each other or regulate each other, or both are needed, or one is needed after the other. And therefore, you will only see certain combinations of phenotypes, thereby resulting in modifications of the standard 9 to 3 to 3 to 1 ratio. So let's recap. At this point, we've talked about four different ways that you can have variations from the standard binary, complete dominance, independent assortment case of the 9 to 3 to 3 to 1 genetic ratios found by Mendel.

Starting point is 00:54:26 So one is if you have a failure of independent assortment, so that occurs when you have genes close to each other on the same chromosome. Another is if you have the failure of complete dominance, so for example, incomplete dominance. Another is if you have multiple alleles, instead of just two alleles, you have multiple alleles for the same gene, and they interact with each other in some way, so we see that in codominance, for example. And the fourth case is if we have multiple different genes that interact with each other in various ways in determining the same trait. And so that occurs in epistasis. Yet another case where we can have violations of the standard 9 to 3 to 3 to 1 ratios

Starting point is 00:55:06 is because of varying levels of penetrance and expressivity. So let me just explain what these are. In a population, you may have individuals who have the same allele for a given gene. However, based on their environment or possibly interactions with other genes, these individuals will have different phenotypes despite having the same allele for a given condition. So effectively, this is just a way of saying that for many alleles, just having that allele doesn't mean that you will necessarily have any particular corresponding phenotype, it just affects your probability of having that phenotype because of the effect of other genes that you have plus the environment that you're in. So the penetrance is a way of

Starting point is 00:55:46 describing, it's the percentage of individuals with a specific allele who will exhibit the corresponding phenotype. Expressivity refers to the degree to which a gene is phenotypically expressed in a single individual. So expressivity is defined for a single individual. Penetrance is for a population of individuals. So you can think of it as penetrance is the degree to which you get a trait in a population with the corresponding genotype, expressivity is the degree to which you get a trait in an individual with the corresponding genotype. Now, simple Mendelian traits, like those we've been talking about

Starting point is 00:56:20 with the pea plants, have 100% penetrance and a fixed expressivity for a given individual. So 100% penetrance, meaning if you have the corresponding allele, you will necessarily always have the corresponding trait. 100% of the time, every individual in the population will have the corresponding trait if they have the corresponding allele. Nice and simple. And a fixed constant expressivity means that the trait always manifests in the same way for every individual who has it.
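
As a small illustration of the population-level definition, penetrance can be computed as the fraction of allele carriers who show the trait. The function name and the sample below are hypothetical, invented purely for illustration.

```python
def penetrance(individuals):
    """Fraction of allele carriers that show the corresponding phenotype.

    `individuals` is an iterable of (carries_allele, shows_phenotype) booleans.
    """
    carriers = [shows for carries, shows in individuals if carries]
    if not carriers:
        raise ValueError("no carriers in the sample")
    return sum(carriers) / len(carriers)


# Hypothetical sample: 8 of 10 carriers show the trait, plus 5 non-carriers.
sample = [(True, True)] * 8 + [(True, False)] * 2 + [(False, False)] * 5
print(penetrance(sample))
```

A simple Mendelian trait would give 1.0 here, while the made-up sample gives 0.8, which is incomplete penetrance. Non-carriers don't enter the calculation at all.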

Starting point is 00:56:45 So in the case of the plants, for example, you have green pods or yellow pods. Every individual who has the relevant allele will have either green or yellow as appropriate for their allele. And also, the colour is always the same. So it's always the same shade of green or the same shade of yellow. There's not sort of varying degrees of greenness or yellowness depending on the individuals; that would be varying expressivity. Now, in reality, most traits have variable penetrance and variable expressivity. And typically, this is because of gene interaction effects and also interactions of genes

Starting point is 00:57:17 and the environment. So variable penetrance or variable expressivity can be due to epistasis, or it can be due also to incomplete dominance or codominance or other factors that we talked about. Basically, these are ways of quantifying these sort of interaction effects. But they can also be due to interactions between genes and the environment, which is something, again, that we haven't talked about yet. So it's important to bear in mind that there are a lot of assumptions made or a lot of strict conditions that have to be met in order to get the 9 to 3 to 3 to 1 ratios that

Starting point is 00:57:47 Mendel discovered. So they are very useful to understand the underlying process of inheritance, but you won't actually observe those ratios in most real cases because of phenomena like incomplete dominance, epistasis, variable expressivity and so forth. So to conclude this episode, we'll talk a little bit more about this relationship between genotype and phenotype and heritability and nature versus nurture and so forth, and we'll do so first by introducing the notion of a quantitative trait. So a quantitative trait is a phenotype that depends on the cumulative combined actions of many genes together, plus the environment. We'll talk about the genes first, though, and we'll get to the environment in a bit

Starting point is 00:58:24 later. So these traits typically vary across individuals over some range, which is why they're called quantitative, because so far we've talked typically about categorical traits, often binary, so like yellow or green or tall or short plants or wrinkled or smooth, but also some have more than two categories, like if you think about blood types, there's A or B or both, or there's intermediate phenotypes such as those we see with incomplete dominance, but with quantitative traits, it's typically more complicated than that. Typically we have either continuous traits, like, say, height, which vary across a continuum, and you can kind of be any height, you know, within some range, or we have so-called meristic traits. And I

Starting point is 00:59:03 don't know why they have such a weird name. Basically, these are discrete traits where they can be quantified, but they can only have discrete whole numbers. So this would be, for example, the number of seeds per sheath of wheat, or something like that. So what unifies quantitative traits is that there's essentially more than two possible phenotypes corresponding to a given gene, and often a quantitative, measurable number of them. So height is sort of the classical example of a quantitative trait. Most human traits, and indeed most traits in most organisms, are quantitative traits. So those, the sort of simple binary traits where we only had two possible phenotypes each corresponding to one allele, that's very rare. And as I mentioned in humans, there aren't very many

Starting point is 00:59:45 examples of these Mendelian traits. So in order to understand and describe real genetic variation in real organisms for most traits of interest, we need to look to quantitative traits. Now, the problem with studying quantitative traits is that because of the complex interaction of many genes, which gives rise to the quantitative traits, and also the interaction with the environment, we have very strong variable expressivity and typically variable penetrance as well, so incomplete penetrance. Not every individual who has a certain phenotype will have the corresponding genotype and vice versa. Because of all of these complexities, you can't really follow the segregation of each separate gene across generations like Mendel was able to, because there's too

Starting point is 01:00:31 many genes and they don't have a simple relationship between each phenotype and genotype. So even if you could identify all the genes, two individuals with the same genotype would still have different phenotypes because of interactions with the environment. And then even if you have small changes in the genotype, that could result in large differences in the phenotype. So maybe if there's 50 genes that determine a trait, you might have the same allele for 48 of them, but those last two can have interaction effects with all the others such that you have a very substantial difference in phenotype. So there's too much complexity going on, so that Mendel's methods don't really

Starting point is 01:01:03 work with quantitative traits. However, we can still use statistical methods to describe them and understand them. So one very useful finding or phenomenon is that for quantitative traits, the distribution of phenotypes often approximates a normal distribution. So this is the bell curve that you would be familiar with, presumably. And this occurs essentially for statistical reasons, because it turns out that the sum of many small effects, or the sum of many binary choices, is distributed or will become distributed according to a normal distribution. That's a statistical result called the central limit theorem, for those who want to look this up. You can also look up Galton board, that's G-A-L-T-O-N board, if you want to see a simple example of this.

Starting point is 01:01:48 I won't try to describe that here, but basically it's sort of a mathematical result that if you have many small influences which add together, you'll typically get this sort of bell curve distribution of traits in the population. So height, again, classical example, is normally distributed in the human population. Basically because some people have all of the alleles that make you very short, some people have all of the alleles that make you very tall. Most people have a mixture which gives you somewhere in the middle and then it's sort of proportional to either side, right? Now, one way that we can get our heads around the complexity of quantitative traits is to try to study the total variation of the phenotype in a population. So, again, think height here.
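To make the bell-curve point concrete, here is a minimal simulation sketch. It assumes a made-up trait controlled by 50 genes, two alleles per gene per person, where each "plus" allele adds one unit and is inherited with probability one half. Those numbers, and the names `simulate_trait`, `n_loci` and `p_plus`, are illustrative assumptions rather than anything from the episode.

```python
import random
import statistics

def simulate_trait(n_individuals=5000, n_loci=50, p_plus=0.5, seed=0):
    """Return one trait value per individual under a purely additive model.

    Each individual carries 2 alleles at each locus; every 'plus' allele
    adds one unit to the trait. No dominance, epistasis or environment.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    values = []
    for _ in range(n_individuals):
        plus_alleles = sum(rng.random() < p_plus for _ in range(2 * n_loci))
        values.append(plus_alleles)
    return values

trait = simulate_trait()
mean = statistics.fmean(trait)
sd = statistics.pstdev(trait)
print(f"mean = {mean:.2f}, standard deviation = {sd:.2f}")

# Crude text histogram: most individuals pile up near the middle.
for lo in range(30, 70, 5):
    count = sum(lo <= v < lo + 5 for v in trait)
    print(f"{lo:2d}-{lo + 4:2d} {'#' * (count // 50)}")
```

With these assumed settings the mean should land near 50 and the standard deviation near 5, and the histogram bars should be tallest in the middle and taper off on both sides, which is the same shape a Galton board produces.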

Starting point is 01:02:24 That's probably the easiest case to think about, but many traits can be modeled this way, too. So let's imagine the total phenotypic variation in the human population with regard to height. So that's the variance of heights across all humans. So we measure the total amount of phenotypic variance for height, and then we can decompose that conceptually, and then we'll talk about how to actually measure these, we can conceptually decompose it into the variance due to the genotypic variance, so this is variance due to different genes in different individuals,

Starting point is 01:02:51 plus the variance due to different environments across different individuals, plus variance due to the interaction effects of genes with environments. So we've got genotypic effects, environmental effects, and interaction effects. So conceptually, all of the variance of any trait can be decomposed into these three categories. Of course, the trick is, how do I know how much is each, right?
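Written out in symbols, the decomposition just described looks like this. The symbols are the conventional textbook ones and are an assumption here, since the episode only gives the split in words: V_P is the total phenotypic variance, V_G the genotypic variance, V_E the environmental variance, and V_{G×E} the variance due to gene-by-environment interaction.

```latex
V_P = V_G + V_E + V_{G \times E}
```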

Starting point is 01:03:11 How do you know how much variance is due to genes versus environment versus interactions? Well, there are ways of measuring this, and in order to understand that, we have to introduce these two sort of constructs called heritability. Heritability is a statistic used in genetics and also selective breeding, which estimates how much of the variation of a phenotypic trait is due to genetic variation in that population. There's two different and distinct concepts of heritability. Broad sense heritability and narrow sense heritability. And just to be confusing, the way that they're differentiated is that broad sense is given a capital H and narrow sense is given a lowercase H. So obviously that's not incredibly clear.

Starting point is 01:03:53 But I'll just talk about broad versus narrow sense heritability. Now these both measure something about how much of the variance of a trait is due to genetic causes, but they measure it in different ways. So it's important to understand the difference. Broad sense, I think, is a bit easier to understand. Broad sense heritability is just the amount of variance of the traits, so the variance in height, that is caused by the variance in genotype, the genotypic variance in that population,

Starting point is 01:04:24 divided by the total phenotypic variance. So it's just a proportion. Broad sense heritability is the proportion of all the variation in the phenotype that's caused by or, it's probably a better way to say it, that's explained by the genotypic variation. So that makes sense, right? We just get the genotypic variance, divide by the total phenotypic variance,

Starting point is 01:04:43 and that's the amount that's explained by different genotypes, right? Why do we need to complicate it further? Well, the reason that we need to introduce an additional concept of heritability, narrow-sense heritability, is because broad-sense heritability is actually still too complicated

Starting point is 01:04:59 to usually understand or model in many cases. Because broad-sense heritability is due to all genotypic variation, and that includes all complex gene interaction effects. It doesn't include any interaction with the environment, but it does include all interactions between different genes or different alleles of different genes, right? And that's all very complicated, as we've just been talking about,

Starting point is 01:05:21 with the modifications of Mendelian ratios. And so it's often very hard to model that or to use it to predict what an offspring will be like given their parents. So instead, often what we want to do is try to divide out, essentially, the parts of genetic variation that are easy to understand versus the ones that are hard to understand. And so we introduce this distinction

Starting point is 01:05:44 between additive genetic variance and dominance genetic variance. So the additive variance is the phenotypic variance due to additive effects of alleles. So this is sort of nice, right? Because then you can imagine there's 10 different genes that affect height, let's say. If we knew the effect of each allele,

Starting point is 01:06:04 Again, let's say there are two alleles for each of the 10 genes. So all you have to do is measure the effect of having each of these alleles and you sort of add them up, and we get an additive effect. And then you can work out the genetic effect of all of these genes just by adding them together. So additive variance is nice and simple and easy to explain, easy to model, because it's just additive. You just add up the effect of each gene separately from each other. You might not know what all these genes are, but at least you can model them by sort of simple additive measures. Now, the dominance variance by contrast is the phenotypic variance due to effects of dominance. And this is not necessarily complete dominance.

Starting point is 01:06:39 This can be incomplete or co-dominance, different forms of dominance. And this is basically modeled as a deviation from the additive effect. So think of the additive variance as being the easy-to-model bit, and then the dominance variance as being the deviation from that, which is due to complex interaction effects and dominance and things like that, right? The crucial insight here is that since each parent passes a single allele per gene to each offspring, parent-offspring resemblance, so the actual resemblance of each parent to their offspring, depends on the average effect of any single allele.

Starting point is 01:07:16 That means if we want to understand how similar in phenotype a child is likely to be to their parent, we need to look at narrow sense heritability, which is only the variation in phenotype caused by additive genetic effects, not that caused by dominance genetic effects. So, narrow sense heritability is very important because, as I said, it's easier to model, because it's easier to model the additive effects compared to the dominance effects and all the complex interactions. But also, it's important for family health and for selective breeding of animals and plants, because it tells us more specifically how similar in phenotype offspring would be to their

Starting point is 01:07:52 parents. Broad sense heritability doesn't really tell you that. I think the easiest way to understand the difference is to think about two heterozygote parents. So let's go back even to our simple case of a single trait determined by two alleles and complete dominance. Now, if you have two heterozygote parents, both of those parents could exhibit the dominant trait, well, will exhibit the dominant trait, because they're heterozygote, so they have one copy of the dominant allele, so they'll show the dominant trait. It's possible, however, for those two heterozygote parents to have a homozygote-recessive child. In fact, one in four of their offspring will be homozygote recessive,

Starting point is 01:08:29 if we make the standard assumptions, right, of independent assortment, all that. Now, crucially, a homozygote recessive child will not share the phenotype of whatever this trait is, they won't have the same phenotype as either parent. And so if you were to just sort of look at that simplistically, you would say, well, there's no relationship between the child's phenotype and the parent's phenotype. It's not heritable at all. Now obviously we know, because of how inheritance works, that they did inherit it. I mean, the only reason they have that genotype, the homozygote recessive, is because they inherited a copy of the allele from each of their two parents,

Starting point is 01:09:06 right? So it's in a sense completely inherited. But if you just look at it in the simple sense of how similar is their phenotype compared to their parents, it's not the same at all. And so the way we sort of understand the difference is that this trait, whatever the trait is, is heritable in the broad sense, but not in the narrow sense, at least in the case of the homozygote recessive child, right? Narrow sense heritability only tells you the extent to which phenotype is determined by additive genetic effects, and when you have a simple case of complete dominance, the trait isn't additive, right?
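One way to see how the additive and dominance pieces separate for a single gene with complete dominance is the sketch below. It relies on assumptions the episode does not state: random mating (so genotype frequencies are q², 2pq and p² for allele frequency p), arbitrary phenotype values of 0 for the recessive homozygote and 1 for the other two genotypes, and no environmental variance at all. The additive part is taken as whatever a straight-line fit of phenotype on allele count captures, and the dominance part is the leftover. The function name `partition_single_locus` is hypothetical.

```python
def partition_single_locus(p, value_aa=0.0, value_Aa=1.0, value_AA=1.0):
    """Split the genotypic variance at one locus into additive and dominance parts.

    p is the frequency of allele A (0 < p < 1). Genotype frequencies assume
    random mating. Default values model complete dominance of A over a.
    """
    q = 1.0 - p
    freqs = [q * q, 2 * p * q, p * p]       # aa, Aa, AA
    counts = [0, 1, 2]                       # copies of allele A
    values = [value_aa, value_Aa, value_AA]  # genotypic values

    mean_x = sum(f * x for f, x in zip(freqs, counts))
    mean_g = sum(f * g for f, g in zip(freqs, values))
    var_x = sum(f * (x - mean_x) ** 2 for f, x in zip(freqs, counts))
    var_g = sum(f * (g - mean_g) ** 2 for f, g in zip(freqs, values))
    cov = sum(f * (x - mean_x) * (g - mean_g)
              for f, x, g in zip(freqs, counts, values))

    slope = cov / var_x               # average effect of one extra copy of A
    v_additive = slope ** 2 * var_x   # variance captured by the straight-line fit
    v_dominance = var_g - v_additive  # deviation from additivity
    return v_additive, v_dominance

for p in (0.5, 0.9):
    va, vd = partition_single_locus(p)
    print(f"p = {p}: additive share of genetic variance = {va / (va + vd):.3f}")
```

Under these assumptions the additive share is about two thirds when both alleles are equally common, and drops below a fifth when the recessive allele is rare (p = 0.9). So in this toy model, how far narrow sense falls below broad sense depends on the allele frequencies as well as on the dominance itself.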

Starting point is 01:09:38 There's an additive effect there, which you see in the heterozygote case, but in the case of the homozygote recessive, you don't see that at all. You see a substantial deviation from the additive case, in which case we say that the narrow sense heritability of that trait is much lower than the broad sense heritability. The trait may be 100% broad sense heritable, but only to a small extent narrow sense heritable, because only a small amount of the total genetic variance is actually additive in nature. This may be a little confusing because when we were looking at simple Mendelian genetics, we often made the assumption of complete dominance, and we regarded that as sort of the simple case,

Starting point is 01:10:18 anything other than complete dominance results in modification of Mendelian ratios, which is true. But remember, we're now talking about quantitative traits. In the land of quantitative traits, dominance is actually complicated. The simple version, the simplest model, the baseline in the land of quantitative traits,

Starting point is 01:10:35 isn't dominant recessive. It's actually just additive effects. It's imagining each allele gives you some unit of an effect, which you can just add up over all of the different gene loci that affect that trait, right? So that's why in this case, when there's actually gene dominance or other types of gene interaction effects, codominance, incomplete dominance, epistasis and all that stuff,

Starting point is 01:10:53 that actually gives you deviations from pure additive effects. The narrow sense heritability will be less than the broad sense heritability. If everything was purely additive, then the narrow sense heritability would be the same as the broad sense heritability. But in reality, some effects are additive, but many effects are not because of all the complex interactions. And these deviations mean that the narrow sense heritability is less, and it means that the phenotype of offspring won't be as similar to their parents as you would otherwise expect.
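The relationship between the two heritabilities can be summarised in symbols. The notation is the conventional one and is an assumption here, since the episode only describes it verbally: V_P is the total phenotypic variance, V_G the genotypic variance, V_A the additive part, V_D the dominance part, and V_I the part due to interactions between different genes (epistasis).

```latex
\begin{align}
V_G &= V_A + V_D + V_I \\
H^2 &= \frac{V_G}{V_P} \\
h^2 &= \frac{V_A}{V_P} \le H^2
\end{align}
```

The two are equal only when V_D and V_I are both zero, which is the purely additive case.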

Starting point is 01:11:25 And again, the heterozygote parents giving rise to a homozygote-recessive child is a very clear example of that, because although that condition is 100% heritable in the broad sense, it's not very heritable in the narrow sense because of that possibility of having the homozygote recessive. So, hopefully that's somewhat clear. It's this distinction about different ways of defining heritability. And I think the lesson there is it's important not to think of heritability purely in terms of how similar the traits of children are compared to their parents. Because that's only one way to think about it. I mean, sometimes you want to think about it that way.

Starting point is 01:11:59 That's narrow sense. But broad sense heritability is sort of more comprehensive. It actually tells you of all of the variation in phenotype how much is due to genotype variation. And the thing there is that although children inherit their genotype from both parents, they mix up the genes of the two parents. And so they don't inherit a whole genotype from either of their parents. And so if gene interaction effects are very common, you may have instances where conditions are completely genetically inherited,

Starting point is 01:12:27 but narrow sense heritability is very low. Because each child receives all of their genetic material from their parents, but they mix it up so that the set of gene interactions is going to be very different to either of their parents. and so the actual phenotype will look very different. So this is a very important insight that just because you're inheriting all your genes from your parents, it doesn't actually mean that the phenotype will be inherited. Inheritance of genotype does not imply inheritance of phenotype when there are gene interaction effects. I think that's the key insight here.

Starting point is 01:12:58 Researchers have developed many methods of trying to identify where the genes are that affect quantitative traits. The older method of doing this used genetic mapping, again I'll talk more about that in a future episode. There's this particular technique called interval mapping, where basically you construct a genetic map with markers that you can identify on a chromosome for a given organism, and you try to identify which markers are associated with certain traits. The idea is you're looking for any markers, hopefully multiple markers, each of which will have some contribution to the overall effect, right? So each of these markers that is associated with some kind of quantitative trait is called a quantitative trait locus.

Starting point is 01:13:38 So this is just a region on some chromosome that is associated with a quantitative trait of interest. It doesn't necessarily tell you what specific gene is associated with the quantitative trait because a marker will only localize to a certain region on a chromosome. It won't tell you exactly which gene it is. To do that, we have to use more sophisticated methods called genome-wide association studies. So genome-wide association studies are kind of the state-of-the-art when it comes to trying to determine the genetic basis for quantitative traits. So the way genome-wide association studies work is that you pick a trait that you're interested

Starting point is 01:14:13 in, let's say height, to go with our example. And then you sequence the genomes of a large number of organisms from a population of interest. So let's say we're interested in the variation in height in the United States. So you grab 10,000 individuals from the US population, ideally a representative sample, right? You sequence all their genomes, and then what you do is you align them against each other, and then you examine each nucleotide one by one across the entire genome. And you look at all of the variations in that nucleotide across all of the individuals in your population. Some nucleotides will have a lot of variation, some will have very little

Starting point is 01:14:46 variation, some will have intermediate amounts depending on what that nucleotide codes for. Is it part of a gene? Is it not? Is it a point where there's able to be variation and still produce viable offspring? Is it not? Right. So it will depend. And these individual differences are called single nucleotide polymorphisms or SNPs. So these are single points in the genome that differ across individuals. And what you do is you use complex statistical methods to try to find which of these SNPs are most associated with the phenotypes of the individuals whose genomes you sequenced. I forgot to mention we also measure the height of each of the individuals, right? So we've got their genome, we've got their height, and you look for correlations between certain types of SNPs,

Starting point is 01:15:24 certain polymorphisms at particular sites in the genome, and height. And so what you're looking for is sites where people who have this particular polymorphism tend to be taller or shorter than average. And you're hoping to find as many of these as you can on the basis that these are genetic loci that contribute to the quantitative trait of interest. These genome-wide association studies took off in the early 2000s after the sequencing of the human genome. But for the first decade or so, there was a lot of excitement and a lot of hype, but relatively few robust, replicable results finding reliable genes, or candidate genes, at least, for particular traits. And more recent work has used much larger sample sizes.

Starting point is 01:16:08 So earlier ones used maybe tens of thousands. More recently, we're using hundreds of thousands in the sample and also various other methodological and statistical changes which have been implemented. And we're starting now to find, I was just looking at some review papers on this, we're now starting to find more robust results that are more replicable and that have larger effect sizes that are able to account for more genetic variation.
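The core idea of the site-by-site scan described above can be sketched on simulated data. This is a toy under loud assumptions, not how a real study is run: it uses 2,000 made-up people and 20 made-up SNPs, plants a single causal site at index 7, and ranks sites by a plain correlation between allele count and height. Real studies typically scan vastly more sites, fit regression models with covariates, and correct for multiple testing. The names `simulate_scan` and `pearson`, and all the numbers, are hypothetical.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def simulate_scan(n_people=2000, n_snps=20, causal=7, effect=1.0, seed=1):
    """Simulate genotypes and heights, then score every SNP against height."""
    rng = random.Random(seed)
    # Genotype at each SNP = number of copies (0, 1 or 2) of the variant allele,
    # with an assumed variant frequency of 0.3 at every site.
    genotypes = [
        [int(rng.random() < 0.3) + int(rng.random() < 0.3) for _ in range(n_snps)]
        for _ in range(n_people)
    ]
    # Only the causal SNP shifts height; everything else is random noise.
    heights = [170.0 + effect * person[causal] + rng.gauss(0.0, 1.0)
               for person in genotypes]
    return [abs(pearson([person[j] for person in genotypes], heights))
            for j in range(n_snps)]

scores = simulate_scan()
top_snp = max(range(len(scores)), key=lambda j: scores[j])
print(f"strongest association at SNP index {top_snp}, |r| = {scores[top_snp]:.2f}")
```

In this toy the planted site stands out clearly because its effect is large relative to the noise. If the `effect` argument is shrunk, the sample size needed to pick it out grows quickly, which is one way to see why larger samples helped.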

Starting point is 01:16:29 So the results are more promising than they were 10 years ago. However, there still is, I would say, a relative sense of sort of disappointment, because we were hoping that with these genome-wide association studies, we would be able to find, you know, maybe there's not one gene for height, right? I mean, we know there isn't, but maybe there's a handful of genes, a few dozen genes, each of which contributes a meaningful amount to height, or, since people aren't that interested in height, to various disease traits, for example, or things like intelligence, all sorts of other traits of interest, right? People were hoping that we would at least be able to find a bunch of important genes and, you know, mutations or variations in those genes that accounted for the variations in the trait. But what we've actually found with these more recent studies that use the more updated techniques is that many of these quantitative traits appear to be far more polygenic than was even thought before. So maybe it's not that the trait is given by a few dozen genes or a hundred genes. Maybe it's that the trait is given by hundreds or even thousands of genes, each of which contributes a tiny effect, spread across the genome.

Starting point is 01:17:28 Also, it's becoming increasingly clear that much of the important variation is due to very rare allelic variants. So instead of there being like two or three common alleles and you can see what the effect each of them has on the phenotype, maybe there's dozens of alleles, some of which are very rare, but have important effects when you see them. And that is much more difficult to detect statistically because you need much larger sample sizes to find the impact of rarer alleles. So what we're seeing is that more allelic variants are important, and many of them quite rare,

Starting point is 01:17:59 so they're hard to study, and also more genes are important. Traits are highly polygenic, not just determined by a small number of genes, but many, many genes, like hundreds of genes or even thousands. Another thing that's becoming increasingly clear is that gene interactions with the environment and gene-to-gene interactions with each other are both very important. Genome-wide association studies can incorporate this, so they can look not just for individual sites that correlate with the phenotypic trait of interest, but they can look for gene interaction sites. So if I have both of these alleles, say variants in this place and this place, if I have both

Starting point is 01:18:30 of those, does that correlate with a difference in the phenotype? So those are interaction effects. You can look at that in genome-wide association studies, but the problem is the number of potential interactions goes up exponentially with the number of gene sites that you're interested in. And also, you might not want to consider just two-way interactions, but three-way interactions or four-way interactions. Maybe you need to have all four of these allelic variants in order to show an effect on some phenotype. So the problem is that this type of gene-gene interaction effect, as well as interactions with the environment in different populations of humans, like in different countries or age cohorts and things, these types of interaction effects appear to be ubiquitous and much more important even than we thought.
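To get a feel for how fast the number of candidate interactions grows, here is a small counting sketch. It assumes a hypothetical scan of 1,000 sites and simply counts how many distinct pairs, triples and quadruples of sites exist. For a fixed interaction order the count grows polynomially with the number of sites, while the count over all possible orders together is 2 to the power of the number of sites, which is where the exponential blow-up comes from. The function name `interaction_counts` is made up for this illustration.

```python
from math import comb

def interaction_counts(n_sites, max_order=4):
    """Number of distinct k-way site combinations for k = 2..max_order."""
    return {k: comb(n_sites, k) for k in range(2, max_order + 1)}

counts = interaction_counts(1000)
for order, n_combinations in counts.items():
    print(f"{order}-way interactions among 1000 sites: {n_combinations:,}")
```

Even at 1,000 sites there are roughly half a million pairs and over forty billion four-way combinations, so testing them all quickly runs into both computational and statistical limits.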

Starting point is 01:19:11 The good news is that we are getting better at identifying potential genetic influences on quantitative traits, but the bad news is that there just may not be very much we can actually say about them, other than quantifying it in the broadest sense. And there was a paper that I looked at recently, which I think described this well. So I'll just read out the quote from there. "We now know that gene expression depends on many factors, including environmental factors and complex interactions between these factors. It is therefore not possible to distinguish and quantify the influence of one factor independently of the others.

Starting point is 01:19:42 In the analysis of complex traits, however, we just cannot say how much of the area of a rectangle is due separately to each of its two dimensions. It's not possible to separate nature from nurture. Even if we add interaction terms in a statistical model, we cannot really capture the complex relation that exists between these factors." I think that really puts it quite well, that trying to distinguish between the variance due to genotypic variance and the variance due to environmental variance and then the variance due to the interaction, maybe that's possible for certain traits, but in most cases

Starting point is 01:20:13 it doesn't really make a lot of sense. At least for humans, living in uncontrolled environments. If you're engaging in things like selective breeding, then you can carefully control the genotype and the environment, you can have paired matings. Then this sort of analysis does make sense, and it's very useful for improving crop varieties. But for humans in natural environments

Starting point is 01:20:31 where we can't rigorously control all the aspects of the environment and we can't engage in selective breeding and so forth, then it doesn't really make a lot of sense to ask, what is the effect, how much of the trait is determined by genetics versus the environment, or even how much of the trait is determined by this gene compared to that gene. Because what it comes down to

Starting point is 01:20:51 is genes don't code for traits. They don't code for phenotypes. Genes code for RNA, which then codes for proteins. A protein has often many functions within an organism, and the functions only are possible in the context of other proteins doing what they're supposed to.

Starting point is 01:21:08 So you really need to go through the mechanistic pathway to understand how these things fit together. For some traits that are relatively simple, we can kind of skip steps, right? But for many traits, and for many complex traits of interest, that's not possible, and if we try to skip steps we'll just be very confused, because what we're seeing is the result of a very complex set of interactions of genes and environment, without understanding all the pieces in the middle. So that's my take on this, and that, I think, forms a natural endpoint to this episode. So I hope you found this interesting. If you did, consider subscribing to the podcast, or if you already have, you could also give the podcast a review, a favorable review, on the aggregator of your choice. I'm also very grateful to all of my generous patrons or other one-off

Starting point is 01:21:53 donors. You can make a donation via PayPal to my email address, Fodz12 at gmail.com. You can become a patron on my Patreon, just Google Science of Everything podcast, Patreon, if you would like to become a supporter. I'm very grateful to all of my backers. You can also just send me an email to ask questions or give comments or let me know about your listening experience. I always love to hear from listeners. So again, my email there is just Fodz12 at gmail.com. Thanks very much for listening. I'll talk to you next time.
