The Science of Everything Podcast - Episode 68: Protein Structure and Function

Starting point is 00:00:34 You're listening to The Science of Everything Podcast, episode 68, protein structure and function. I'm your host, James Fodor. Recommended pre-listening for this episode is episode 18, Biochemistry Basics. Episode 10 on the Cell may also be slightly useful. So in this episode, we're going to look at proteins. We talked about them briefly in episode 18 and maybe touched on them elsewhere, but we're going to dive into them in a bit more depth in this episode. First I'm going to just talk a little bit about some of the methods that are used to

Starting point is 00:01:03 study proteins. And then I'm going to talk about protein structure and protein function. So protein structure is just how proteins are structured, what they look like and how they fit together. And protein function is how they behave and what they do, with a focus on looking at proteins as enzymes and on protein folding. So let's jump straight in. First, as I said, we'll talk a bit about the methods used to study proteins because some of them are quite interesting. So when we want to study proteins, the first step that we need to take is to obtain a pure sample of the protein. So just to remind you, proteins are a type of macromolecule found in all living organisms. They consist of a sequence of amino acids strung together, usually hundreds or thousands of amino acids,

Starting point is 00:01:48 and they fold up in a particular three-dimensional structure, which determines their function. And proteins are responsible for a very wide variety of functions in the cell, including acting as messengers and as transportation molecules and as signaling molecules and structural molecules, all sorts of things. Really, they're the workhorse of the cell. Most functions like that are performed by proteins of some form. So they're pretty important, so of course we want to be able to study them. How do we study them? The first step is usually obtaining a pure sample of the protein that we want to look at,

Starting point is 00:02:22 but cells don't just spit out pure samples of the protein for our pleasure. We have to purify the sample. So how do we do that? Well, usually the first thing that we have to do is obtain the protein in some solution by breaking the tissue or cells that contain the protein. So the protein is produced by cells, perhaps bacteria, or maybe it's cells that we've taken from a human sample or an animal sample or something like that. But either way, we have to get the protein out of the cell into solution. And there are a few ways of doing this. One way you can do it is by repeated freezing and thawing, which can fracture the cell membrane, or you can use a technique called sonication, which is using high-energy sound waves to rupture the cell membrane,

Starting point is 00:03:03 or you can apply certain chemicals which make the membrane or cell wall, if there is one, permeable. All of these methods are basically ways to break up the cell membrane and/or the cell wall in order to get the protein out into solution. So once we've done that, we then need to find a way of extracting the protein from the solution, because there will be a whole bunch of other stuff in the solution, including many different types of proteins, probably various other solutes, nucleic acids and lipids and whatever else. We need to extract the protein that we want out of that sort of mess and purify it. So how can we do that?

Starting point is 00:03:38 Well, there's actually a very large number of different methods which can be used. I'll just talk about sort of three here, in fairly broad terms, to give you an idea. Usually the way that you purify a protein sample, or really a sample of anything, is to apply a number of these techniques in succession, because no single technique works perfectly. Each purifies the sample a little bit more, removing more of the other things you don't want, and if you apply enough of them, eventually you get a sample that's pure enough to be useful for further study. So the first technique that I want to talk about is centrifugation.

Starting point is 00:04:12 Basically, you put the solution containing your protein in some sort of beaker and spin it around really, really fast in a machine. The purpose of that is to separate out the particles in the solution by density, essentially because the centrifugal force (centrifugal force is actually a fictitious force, as we talked about in a previous episode, but we'll just call it a force for now) pushes out, in a sense, the particles in solution with a force proportional to their mass. However, there's also a resistance force on the particles in the solution, which is proportional to their size, basically because the water or the solution is sort of

Starting point is 00:04:49 pulling on them, resisting them flying out to the bottom end of the beaker as they're spun around. That force is proportional to the size of the molecule: the bigger it is, the more sort of drag there is on the molecule, basically. So the outward force is proportional to mass, the inward force is proportional to size. So denser particles get pushed further away, and less dense particles stay closer to the top of the beaker, basically. And that's what centrifugation does. It just separates out particles by their density. So that's one way that you can separate out different types of proteins and other types of molecules as well. So the proteins will be at a certain layer in the centrifuged solution.
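The balance described here, an outward push that grows with mass against a drag that grows with size, can be sketched numerically. What follows is only a rough illustration, assuming each particle is a small sphere that obeys Stokes drag. The function name and every number in the example are illustrative assumptions, not values from the episode.

```python
import math


def sedimentation_velocity(radius_m, particle_density, fluid_density,
                           viscosity, rpm, rotor_radius_m):
    """Steady drift speed (m/s) of a small sphere in a spinning tube.

    Assumes a rigid sphere and Stokes drag. Positive means the particle
    moves outward, towards the bottom of the tube.
    """
    omega = 2 * math.pi * rpm / 60.0            # angular speed, rad/s
    accel = omega ** 2 * rotor_radius_m         # centrifugal acceleration, m/s^2
    volume = 4.0 / 3.0 * math.pi * radius_m ** 3
    # Net outward force: particle mass minus the mass of the fluid it displaces.
    outward_force = volume * (particle_density - fluid_density) * accel
    # Stokes drag per unit speed grows with the particle's size (radius).
    drag_per_speed = 6 * math.pi * viscosity * radius_m
    return outward_force / drag_per_speed


# Illustrative values only: a 3 nm sphere in water at 50,000 rpm, 7 cm from the axis.
denser = sedimentation_velocity(3e-9, 1350.0, 1000.0, 1e-3, 50000, 0.07)
lighter = sedimentation_velocity(3e-9, 950.0, 1000.0, 1e-3, 50000, 0.07)
print(denser > 0, lighter < 0)
```

In this simplified picture a particle denser than the surrounding fluid drifts outward, one less dense drifts back towards the top, and one with exactly the fluid's density does not move, which is the sense in which centrifugation sorts by density.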

Starting point is 00:05:30 And therefore you can separate them out from things like lipids, which have different densities, or other solutes. So once we've got a bit of a pure sample, but there's still a bunch of other proteins and other things in there as well, we can use a technique called column chromatography. Now chromatography is a general technique that can be used for separating out different particles in a solution. The basic idea of column chromatography is that we have what's called a stationary phase, which is some sort of gel that you put inside a column. So like a long beaker, basically, you can think of it. And then on top of that, you put what's called the mobile phase.

Starting point is 00:06:06 So there's a stationary phase, which sits there, and a mobile phase, which you put on top. The mobile phase is a liquid. It's the solution that you're interested in purifying. It's got your protein in it. You put that on top, and either under the force of gravity, or with external pressure applied as well if you want, the solution percolates down through the stationary phase, through the sort of gel structure or whatever it is exactly. There are different things you can use. And basically, different particles or different substances in the solution are impeded in their travel down through the column by varying degrees, depending on what they are.

Starting point is 00:06:44 So different proteins might be inhibited by different amounts because they have different masses or densities or shapes or whatever else. They interact with the stationary phase in slightly different ways, and therefore they travel at slightly different speeds. And as a result, the different substances in solution will separate out. And eventually, the most rapidly traveling bits of the solution get to the end of the column and drip out, and then you continually add more solution to compensate for that. But basically, what you do is you just keep forcing the solution down until you get to the bit that you want, the section of the solution that contains the protein that you're interested in, and then you just sort of collect that.

Starting point is 00:07:21 So basically the column chromatography separates out different particles or substances in the solution by how much they interact with the stationary phase, which, again, can be some sort of gel or something else like that. Okay, so we've applied centrifugation, we've applied column chromatography. Now we can use another technique called electrophoresis, which enables a sorting of molecules based on size. This actually is particularly useful for determining whether or not you've actually purified your sample of protein very, very well.

Starting point is 00:07:53 Basically, it's kind of similar to column chromatography. The main difference is we use an electric field to separate out, say, particularly proteins, different types of proteins, based on their charge. The charge of the molecules that we put on might be fairly similar, but the masses, of course, will be different. So the molecules will sort according to their masses, in particular, the larger the mass of the protein, the greater inertia it has, and so the longer it takes to move along the gel.

Starting point is 00:08:24 So you put them in a sort of a gel, and you apply the electric field, and you see how long it takes them to move. And basically they separate out according to their size, because the charges on them are all sort of similar, but smaller proteins move more rapidly because they have less inertia, and bigger proteins move more slowly because they have greater inertia. So you can separate out the protein segments based on their mass, and that's another way of separating out proteins.
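Because migration in the gel depends on size, a gel can be read against marker proteins of known mass. The sketch below assumes the common rule of thumb that the logarithm of the mass falls roughly linearly with the distance travelled. The marker values are made up purely for illustration, and the function names are hypothetical.

```python
import math


def fit_standard_curve(markers):
    """Least-squares line: log10(mass) = slope * distance + intercept.

    `markers` is a list of (distance_travelled, known_mass) pairs.
    """
    xs = [distance for distance, _ in markers]
    ys = [math.log10(mass) for _, mass in markers]
    n = len(markers)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_mass(distance, slope, intercept):
    """Read an unknown band's mass off the fitted marker line."""
    return 10 ** (slope * distance + intercept)


# Made-up markers: (distance in cm, mass in kDa). Not real measurements.
MARKERS = [(1.0, 100.0), (2.0, 50.0), (3.0, 25.0), (4.0, 12.5)]
slope, intercept = fit_standard_curve(MARKERS)
print(round(estimate_mass(2.5, slope, intercept), 1))
```

The slope comes out negative, which is just the statement that bands that travel further correspond to smaller proteins.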

Starting point is 00:08:47 It is also a way of determining whether a given protein sample that you have is pure or not, because you can compare it to sort of known samples and see whether you've got multiple different lengths of proteins or if you've only got one. That's electrophoresis, the basic idea behind it. Okay, so there are a few methods we can use to purify proteins. As I said, I'll just mention them again: centrifugation, column chromatography, and electrophoresis. But we use these methods, and through an iterated process, we eventually obtain a relatively pure sample of protein. What do we do with that? One thing we might want to do with it is

Starting point is 00:09:23 determine its structure, the three-dimensional structure of the protein, which we may not know. How can we do that? Well, we can use a technique called x-ray crystallography. How does this work? Well, a crystal is a regular array of atoms, or in this case molecules, arranged in a regular repeating pattern. And x-rays are just light rays with a sufficiently small wavelength. The reason we need a small wavelength is because we need a wavelength of light which is comparable to the bond lengths between the atoms in the proteins that we're studying. If we used something bigger, like visible light, for example, which has a bigger wavelength, then we wouldn't have the resolution needed to examine the small structure, the small size of the bond lengths. So we need to use x-rays, or similarly small wavelengths of light, in order to study proteins the way we want.

Starting point is 00:10:09 So that's why we need x-rays. But how does the process actually work? Well, basically, the crystallized protein acts as a diffraction grating, producing a series of light patches or dots, which you can then capture on a photograph and examine. So, if this idea of a diffraction grating is unfamiliar, it might be helpful to review episode 32, light and optics, where I talk about these sorts of ideas, but basically a diffraction grating is some object that has a repeating periodic structure that allows light to pass through parts of it, but not through other parts of it. So, I mean, you could think of a hair comb; you could use that as a diffraction grating. It probably wouldn't be a very good one, but it has a repeating structure, and it has gaps where the light can pass through and other places where the light can't pass through.

Starting point is 00:10:56 That's a diffraction grating. So, I mean, anything like that that has a repeating structure you can pass light through could be used as a diffraction grating. The reason we call it a diffraction grating is because when light passes through those gaps, it diffracts, it spreads out or bends. Again, look at the optics podcast from the past if you don't know what diffraction is. But when light passes through an aperture, it bends out and interferes with itself and produces particular patterns. And that's exactly what happens when we pass x-rays

Starting point is 00:11:23 through crystals. The light, in this case x-rays, passes through the gaps in the crystal, so where there are no atoms or where there are no proteins in this case, and diffracts around, because essentially the gaps between the proteins or between atoms in the proteins act as holes in the diffraction grating. We need it to be a crystal because it needs to be a regular structure. If you just put a mess of proteins in there, you wouldn't get anything very interesting. They have to be in a sort of regular ordered structure

Starting point is 00:11:52 in order for this diffraction grating to work. So the light passes through and around the proteins, diffracts and bends and interferes with itself, and then it produces an interference pattern, which is a combination of light and dark splotches. In particular, usually you observe it as a lot of light dots in sort of particular patterns. Often there'll be circles and other things like that. It doesn't really look like

Starting point is 00:12:13 anything. Just a bunch of dots on a dark background. This is actually called a diffraction pattern, basically, and it tells us the structure of the protein. How does it tell us that? Well, there's a technique known as the Fourier transform, which is a mathematical thing, which we won't get into. But basically,

Starting point is 00:12:31 the basic idea is that when you pass light through a diffraction grating, so these little holes in a regular pattern, the resulting pattern of light is the Fourier transform of the shape of the diffraction grating. The Fourier transform being a mathematical function, basically. And there's something called an inverse Fourier transform, which is another mathematical function we can use to go from the diffraction pattern back to the shape

Starting point is 00:12:57 that must have produced that, the diffraction grating shape that must have produced that. So in a sense, by looking at the pattern of dots that the crystal produces, we can determine what the structure of the protein must have been to produce that pattern of dots, using this technique, the Fourier transform. Now, don't worry if some of the details about that were a bit confusing. It is quite technical. As long as you understand the idea that by shining light through the crystal, we can get a pattern of dots

Starting point is 00:13:22 and then apply some fancy maths to that pattern of dots to get back to what the structure of the protein must have been to give that pattern of dots. That's the key idea. That's how X-ray crystallography works, in essence. In order for this to work, you do need to have first a pure sample of protein, and then you have to crystallize it. Again, that's putting the proteins essentially in a regular ordered array,

Starting point is 00:13:41 and that's often quite difficult to do, because proteins aren't like that in biological environments. They're in a solution, so you have to crystallize them, and often that's tricky, particularly for membrane-bound proteins, that is, proteins embedded in cell membranes. You have to get them out of the membrane somehow, and often they're not stable outside the membrane, so they're particularly difficult to crystallize.
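Going back for a moment to the Fourier transform idea behind x-ray crystallography, the round trip from a regular structure to a pattern of bright spots and back again can be sketched in one dimension. This is only a toy illustration: the function `dft` and the 16-point "crystal" are assumptions made for the example. It also keeps the full complex transform, whereas a real detector records only intensities, so real structure determination takes more work than this.

```python
import cmath


def dft(signal, inverse=False):
    """Plain discrete Fourier transform (or its inverse) of a list of numbers."""
    n = len(signal)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(signal[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                    for t in range(n))
        out.append(total / n if inverse else total)
    return out


# Toy one-dimensional "crystal": a gap at every 4th position out of 16.
grating = [1.0 if i % 4 == 0 else 0.0 for i in range(16)]

pattern = dft(grating)                          # analogue of the diffraction pattern
intensity = [abs(c) ** 2 for c in pattern]      # brightness of each spot
bright_spots = [k for k, v in enumerate(intensity) if v > 1e-6]

# The inverse transform takes the pattern back to the structure that produced it.
recovered = [round(c.real, 6) for c in dft(pattern, inverse=True)]

print(bright_spots)
print(recovered == grating)
```

The regular spacing of the grating shows up as regularly spaced bright spots with darkness in between, and applying the inverse transform to the pattern returns the original arrangement of gaps.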

Starting point is 00:14:04 So they're a challenge. But basically, x-ray crystallography has been a very fruitful technique for determining the structure of proteins. The majority of protein structures that we know of have been determined by x-ray crystallography, and computer techniques for that are just getting better all the time. So it's a very handy technique to know about. Another concept that I want to mention which fits broadly into methods is proteomics. Proteome refers to the entire set of proteins that are found in a particular cell or organism or something like that. This includes all of the different modifications and conformations

Starting point is 00:14:40 and so on of the same proteins that may be found, say, in the cell. So the name is based on genome, which is a collection of all the genes in, say, a cell. Well, a proteome is a collection of all the proteins. And there are more proteins than genes, because basically each gene codes for at least one protein, but usually it can code for many different proteins or forms of the same protein. So that's what a proteome is. Proteomics, a name derived from genomics, is the study of a proteome, so including what proteins are there, their relative abundances and modifications, and how they interact, things like that. The human genome, so the set of all our genes, has about 21,000 protein-encoding genes,

Starting point is 00:15:22 the exact number's in dispute, but something like 20,000. But the total number of proteins in human cells is estimated to be, well, we don't really know, but hundreds of thousands, maybe up to a million. So, you know, at least 10 times larger than the number of genes, and maybe more than that. That's, again, because there are lots of ways that a given protein can be modified by adding things to it, or taking bits off it, or combining different proteins in different ways to form bigger proteins. Basically, there are a lot of complexities which we don't need to go into, but a single gene can code for more than one protein, basically,

Starting point is 00:15:55 and we talked about this in some previous episodes, I think. So that's proteomics, and increasingly proteomics is becoming, sort of, combining with other disciplines, like for example, bioinformatics, which is using information technology and big databases and so on, machine learning to study biological systems, because in order to study proteins, we need heaps of data, because there are so many of them, and they're so complicated, and each protein has heaps of amino acids in it, and so on, and so using sophisticated computer techniques and databases and so on is very helpful for understanding the proteome. And so this is a very important growing branch of science, this sort of biotech

Starting point is 00:16:31 technology, proteomics, genomics, bioinformatics, sort of area. So I just thought I'd mention that, to make sure you've heard of the ideas. Now, that's enough on methods. Let's get into talking about protein structure. Now, in the previous episode, Biochemistry Basics, I talked about primary structure and amino acids and so on. I don't want to go over that again, so I'll just presume that you have some idea of that. Just to recap, the primary structure of a protein is just the sequence of amino acids, the particular order of amino acids that make up the protein.

Starting point is 00:17:02 Again, there could be hundreds or thousands of amino acids in a protein. The conformation of a protein is the spatial arrangement of atoms in the protein. So not just the arrangement of amino acids in a chain, but the full three-dimensional spatial arrangement of atoms in the protein. That's its conformation. And any protein can take on a very large number of conformations, although usually only one of them will be the most stable form, which we call the native form or the native conformation, in which the protein

Starting point is 00:17:28 is found in biological environments. Occasionally there's more than one native conformation that it can flip between, but usually there's sort of one that's the most stable and that's most preferred. But there are many other potential conformations as well, and how the protein gets to the native state, to the sort of favoured conformation, we'll look at when we talk about protein folding. So as you may recall, there are different levels of protein structure. There's primary structure, which we already talked about, the order of amino acids.

Starting point is 00:17:55 There's also secondary, tertiary, and quaternary structure, which we mentioned briefly in biochemistry basics. Just to recap, secondary structure consists of alpha helices and beta sheets, which I'll talk about in more detail in a moment. And tertiary structure is sort of the trickiest one, which is the full three-dimensional arrangement of amino acids and also alpha helices and beta sheets,

Starting point is 00:18:20 and the full three-dimensional arrangement of all that stuff in the proteins. So how it all folds up and wraps together, basically. How is all this determined, though? It all gets very complicated. How do we get things like alpha helices and beta sheets and the full three-dimensional tertiary structure of a protein? Well, in order to understand that, we have to understand what are called weak interactions, or also non-covalent interactions.

Starting point is 00:18:44 Again, we've talked about this in sort of previous episodes on chemistry. A covalent bond is a bond between two atoms where they share electrons, and that's a fairly strong type of bond. Those are found generally within molecules, so a water molecule, for example, H2O, is held together by covalent bonds between hydrogen and oxygen. The bonds that hold an amino acid together are covalent bonds, and the bonds that bind amino acids to each other are also covalent. Weak interactions are non-covalent bonds that hold together the full

Starting point is 00:19:14 three-dimensional structure of a protein. In particular, weak interactions are mostly what is responsible for secondary and tertiary structures, and also quaternary structures of a protein. So they're not covalently bonded, but they are bonded through weak interactions. And I think I have mentioned these before. I'll just go through them sort of briefly so you have the idea. Well, one is ionic interactions. Ionic interactions are the attraction of ions that have permanent charges. So ions might be like sodium and chlorine. One's positively charged, one's negatively charged, so they

Starting point is 00:19:43 attract each other. Some amino acids are ionic, so they have charges, and those will of course attract each other: the positive and the negative amino acids will attract each other. So if you had one amino acid that's positive over here and another one that's negative over on this part of the protein, then they would attract each other and sort of wind up together. So that would be one mechanism by which you could have interaction along the protein. Hydrogen bonding is another very important weak interaction that's responsible for protein structure. A hydrogen bond is an interaction between a hydrogen atom and a highly electronegative atom. Usually it's oxygen or nitrogen or something, often oxygen.

Starting point is 00:20:20 So hydrogen bonding often occurs between water molecules or sort of groups on amino acids that are kind of like water molecules, basically, without getting into the details. Basically, the reason for this is because an atom like oxygen has a really strong pulling power for electrons, whereas hydrogen has a very weak pulling power for electrons. So oxygen tends to be sort of negatively charged, like partially negatively charged. Hydrogen tends to be sort of partially positively charged, so they attract each other. That's sort of what a hydrogen bond is. Again, I'm simplifying a bit here, but that's the basic idea.

Starting point is 00:20:51 And since there are lots of hydrogens and oxygens found in amino acids, hydrogen bonds are a very common form of bonding in protein structure. In particular, alpha helices and beta sheets are both stabilized, at least in part, by hydrogen bonds. So they're very important for the secondary structure and also for the tertiary structure as well. There's also something called London dispersion forces. These are sometimes also called van der Waals forces, although that has a slightly different meaning, but basically these are induced dipole interactions whereby, because of quantum mechanics, electrons will sometimes be found more on one side of an atom or molecule than another side. Basically, there's a random component there.

Starting point is 00:21:32 And these sort of temporary dipoles that are set up by this quantum randomness can lead to attraction. So maybe there happen to be more electrons on one side of the atom than the other in a fraction of an instant because of just pure chance, basically. So one side of the atom or molecule has a relatively positive charge, the other one has a relatively negative charge, and this leads to attractive forces. Bigger molecules tend to have bigger London dispersion forces compared to smaller ones, and this can be a component of attractive

Starting point is 00:22:01 forces in proteins. There's also what's called the hydrophobic effect. Now, hydrophobic molecules are molecules that don't like water, with "like" put in quotes, because it's not an issue of attraction, obviously, it's an issue of

Starting point is 00:22:17 chemistry. But the basic idea is that certain molecules, non-polar molecules in particular, so lipids like fatty molecules, tend to be pushed away from water. They tend to like to sort of stick together. Non-polar molecules tend to stick together with each other. Polar molecules like water tend to stick together with each other. And I've discussed this in past chemistry episodes as well. So go back to those if you're interested in more of the chemistry, why that's the case. But basically, non-polar molecules tend to aggregate together and separate themselves away from water. So a common thing to happen in

Starting point is 00:22:47 protein structure is for those hydrophobic components of, say, the amino acids or whatever else, to be pushed inwards. Some amino acids have hydrophobic portions, and others have hydrophilic portions, bits that like to interact with water. So what tends to happen is the hydrophilic portions of the amino acids or the protein remain on the outside and the hydrophobic bits are pushed inwards, because that allows the protein to achieve a lower energy state, basically, because these hydrophobic bits are pushed away from water, and that's a more favourable energy state for them. So this hydrophobic effect tends to influence the three-dimensional shape of proteins by pushing away the bits that don't like to interact with water.
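One rough way to see which stretches of a chain are likely to be pushed inwards is to score each amino acid for how hydrophobic it is and average over a sliding window. The sketch below uses the widely quoted Kyte-Doolittle hydropathy values, where higher means more hydrophobic. The function name, the window size and the example sequence are assumptions chosen only for illustration.

```python
# Kyte-Doolittle hydropathy values, keyed by one-letter amino acid code.
KYTE_DOOLITTLE = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5,
    "M": 1.9, "A": 1.8, "G": -0.4, "T": -0.7, "S": -0.8,
    "W": -0.9, "Y": -1.3, "P": -1.6, "H": -3.2, "E": -3.5,
    "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}


def hydropathy_profile(sequence, window=5):
    """Sliding-window average hydropathy along a sequence.

    Higher averages suggest a more hydrophobic stretch. Raises KeyError
    for letters that are not one of the 20 standard amino acid codes.
    """
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


# Made-up sequence: five leucines (hydrophobic) followed by five lysines (charged).
profile = hydropathy_profile("LLLLLKKKKK")
print([round(value, 2) for value in profile])
```

The profile starts high over the leucine stretch and falls steadily as the window slides onto the lysines. A score like this only hints at where a stretch might end up; it says nothing definite about the folded shape.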

Starting point is 00:23:27 There's also something called a disulfide bond, which is basically just a bond between two sulfur atoms. I just mention this because it is very important for protein structure. Think of it as sort of like if you had two strings of rope next to each other. I mean, maybe it's one string of rope that's sort of coiled back on itself, and the rope represents the amino acid structure of the protein. If you then imagine sort of connecting it together by small strings that run transverse to the main rope, these are kind of like the disulfide bonds. They just sort of anchor the two strands together, the sulfur-sulfur bonds, providing some extra structure to the molecule. So they're particularly important for giving it a sort of overall structure.
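Among the standard amino acids it is cysteine, one-letter code C, that supplies the sulfur for these bonds, so a first pass at spotting possible anchor points is simply to list the cysteines in a sequence. This is a minimal sketch with hypothetical function names. It lists every possible pairing and makes no claim about which pairs actually bond, since that depends on the folded three-dimensional structure.

```python
from itertools import combinations


def cysteine_positions(sequence):
    """1-based positions of cysteine (C) residues in a one-letter sequence."""
    return [i for i, aa in enumerate(sequence.upper(), start=1) if aa == "C"]


def candidate_disulfide_pairs(sequence):
    """Every pair of cysteine positions that could, in principle, be bridged."""
    return list(combinations(cysteine_positions(sequence), 2))


# Made-up sequence for illustration.
print(cysteine_positions("ACDCKC"))
print(candidate_disulfide_pairs("ACDCKC"))
```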

Starting point is 00:24:09 Anyway, so that's an overview of some of the weak interactions that help to determine the secondary and tertiary structure of proteins. So just have those in the back of your mind as I talk more about secondary and tertiary structure, because that's sort of the chemistry that's underpinning what happens at these higher structural levels. Okay, so having discussed weak interactions a bit, now let me go through secondary and tertiary structure in a bit more depth than we did last time. Secondary structure basically consists of alpha helices and beta sheets. These are common recurring patterns that we see in protein structure.

Starting point is 00:24:40 An alpha helix is basically like a coiled helix, kind of like DNA, if you imagine that, that's a helix. DNA is a double helix, though. An alpha helix is just a single one, so it's just sort of one rung going up there rather than two. Like DNA, they can be right-handed or left-handed, depending on which way they're coiled, but it's the same basic idea. And the alpha helix is a very common structure because it's an efficient way of burying the hydrophobic groups in the center. So if you imagine the coil, the hydrophobic bits go in the center, and so they're protected by the alpha helix coil all the way around from the water which surrounds it on the outside.

Starting point is 00:25:12 So that's one of the common elements in the secondary structure of proteins. The other one is the beta sheet, which consists of beta strands, which are sort of connected laterally with each other by hydrogen bonds, as I mentioned before. So you can sort of think of beta sheets as sheets, basically, because that's what they are. They're sort of flat two-dimensional structures, which are formed by essentially a zigzagging pattern of amino acids connected together in a sort of particular way.

Starting point is 00:25:39 Again, the zigzags are connected to each other by these hydrogen bonds. And you actually have an extensive network of hydrogen bonds which run through the beta sheet, keeping it all sort of rigid and stuck together in a sense, bound together. So that's the other main form of secondary structure that I want to talk about. When we talk about tertiary structure, which I'll now move on to, usually what we mean is how the different alpha helices and beta sheets fit together and also how the different bits of the protein as a whole fit together, because in a lot of proteins that you'll see,

Starting point is 00:26:12 a large number of amino acids are contained in either alpha helices or beta sheets, but there are also a bunch of amino acids that are not in either of those things. They're just sort of in random loops or other sort of unstructured bits of a protein. But those are often very important too for determining the overall three-dimensional structure of the protein, or in other words the tertiary structure. So let's look at that now. In order to understand tertiary structure, a very useful concept is what's called the motif. Now, motifs are simple combinations of secondary structure elements that frequently occur in protein structure, and therefore we give them a name because they occur quite often.

Starting point is 00:26:50 They're also sometimes called super secondary structures. So they kind of fit in between secondary and tertiary structure, although that's not quite right, but it's good enough for our purposes here. So it's like a couple of alpha helices together, or an alpha helix and a beta sheet next to each other, or something like that. That would be like a motif. It's some common arrangement of secondary structures. So I'll just briefly mention a few of them. There's no particular reason that you need to remember them or understand their names or anything like that,

Starting point is 00:27:18 but just to give you a feel for it. So one is called beta hairpin, for example. This is two anti-parallel beta sheets connected by a small turn of a few amino acids. So, I mean, basically it's like one beta sheet, and then there's this small turn thing, and then another beta sheet going the opposite direction. So it looks like a hairpin, hence the name.

Starting point is 00:27:35 And that's a common structure that you often see in proteins. There's something called a Greek key, which is four beta strands connected together in a sort of a sandwich shape. There's a helix loop helix, which consists of an alpha helix connected to another alpha helix, bound by a loop of amino acids. So it's kind of like the beta hairpin, except composed of two alpha helices instead of two beta sheets. One of my favorites is the beta barrel. If you imagine taking a piece of paper and wrapping it around on itself to form a cylinder, that's basically what a beta barrel is: it's a beta sheet wrapped around on itself.

Starting point is 00:28:07 And that, as you can imagine, forms a big hole through the middle. And that big hole or passageway can be used as exactly that: a passageway for, say, ions or other things to pass through, for example, to traverse a cell membrane. So beta barrels are common in membrane-bound proteins to allow things to pass through the membrane. So these are some examples of motifs, so common arrangements of secondary structure which appear often in proteins. And often you can tell something about the function of that bit of the protein by looking at the motifs that are found there.

Starting point is 00:28:39 So if I see a beta barrel, I might think, oh, maybe that's a transmembrane protein, for example. Okay, so another concept that's useful in studying tertiary structure is the domain, or structural domain. So a domain in a protein is a segment or a part of the protein that is sort of self-stabilizing and acts independently. A protein is composed of one or more polypeptides, so strings of amino acids. That's what a polypeptide is. But within a given polypeptide, within a given string of amino acids, there are often different segments that sort of fold up and act independently. They are connected to each other because they're part of the same string, but they sort of act

Starting point is 00:29:20 separately in a sense. Connected but separate. And these are called domains. Domains seem to be particularly important in the evolution of proteins because it seems that domains are sort of evolutionarily conserved in that what you might find is a domain that was found in one protein, then appearing in a different protein alongside different domains. So it seems that nature has sort of used domains as a sort of a method of selection, so picking up domains and putting them over somewhere else where they might be useful. Often a domain

Starting point is 00:29:49 fulfills a similar function in different proteins in which it is found. As I mentioned, separate domains often fold up separately from each other and then sort of sit next to each other or whatever. By identifying the domains in a protein, we can often infer what the protein might do, or at least what part of it does. Now, you might be wondering how domains relate to motifs. Well, basically, a single domain might be made up of a bunch of motifs wrapped together or fixed together or structured together in some way. So if we imagine beginning with primary structure, which is just the order of amino acids, then we go up to secondary structure. That's your alpha helices and beta sheets. Then motifs, which are several alpha helices or beta sheets,

Starting point is 00:30:31 organising in some particular regular way that we find a lot. And then we go up to structural domains, which are a bunch of motifs put together, again, that act independently, and that we often find in many different proteins. And that's a domain. And then all the domains within a protein will be the tertiary structure. And then there's one level of structure above that, which is quaternary structure, which we'll get to in a moment. But hopefully you can sort of imagine this sort of sequence of hierarchy.

Starting point is 00:30:56 So we've sort of inserted two levels in between the secondary and tertiary, that is the motifs and the structural domains, although they're a little bit different. They're not quite new levels, but that might be one helpful way of thinking about it, because a domain is made up of a bunch of motifs, which in turn are made up of a bunch of alpha helices and beta sheets and some other bits. Okay, so that's the structural domains.
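
For readers following along in text, the nesting just described (amino acids, then secondary structure, then motifs, then domains, then the whole folded chain) can be sketched as a small data structure. This is only an illustrative sketch: the class names, the toy domain, and the amino acid letters below are made up for the example and do not describe any real protein.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SecondaryElement:
    kind: str       # e.g. "alpha helix", "beta strand", or "loop"
    residues: str   # the stretch of primary structure, one letter per amino acid


@dataclass
class Motif:
    name: str                         # e.g. "beta hairpin"
    elements: List[SecondaryElement]  # a few secondary elements in a common arrangement


@dataclass
class Domain:
    name: str
    motifs: List[Motif]               # a domain is a bunch of motifs that fold together


@dataclass
class Polypeptide:
    domains: List[Domain]             # all the domains together give the tertiary structure

    def primary_structure(self) -> str:
        """Read the amino acid sequence back out of the nested levels."""
        return "".join(
            element.residues
            for domain in self.domains
            for motif in domain.motifs
            for element in motif.elements
        )


# A toy chain with one domain containing one beta hairpin (sequence is invented).
hairpin = Motif(
    name="beta hairpin",
    elements=[
        SecondaryElement("beta strand", "VKVTV"),
        SecondaryElement("loop", "NG"),
        SecondaryElement("beta strand", "KTYEV"),
    ],
)
toy_chain = Polypeptide(domains=[Domain(name="toy domain", motifs=[hairpin])])

if __name__ == "__main__":
    print(toy_chain.primary_structure())
```

As the episode notes, motifs and domains are not strictly new levels of structure, so a real protein would not always decompose this neatly; the nesting is just one helpful way of picturing it.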

Starting point is 00:31:16 There are four main types of structural domains, and basically this just depends on what type of secondary structures make them up. So there are ones that have all alpha helices, ones that have all beta sheets, and there are also ones that mix the alpha helices and the beta sheets. And there's actually two types of those. There's ones where they're just sort of mixed together, and there's another one where they're sort of separated. But we needn't worry too much about those.

Starting point is 00:31:39 But just to emphasize that there are different classes of domains, and there are lots of complicated ways of dividing them up and categorizing them and so on. We needn't worry ourselves about that here. Just understand that they're made up of a bunch of motifs, they sort of operate and fold independently, and they often fulfill a similar function regardless of what protein you find them in. That function might be, for example, acting as a receptor or binding to a cell membrane or something like that, and it does that across the multiple proteins that the same domain is found in.

Starting point is 00:32:08 We can also, aside from classifying domains, classify proteins, like the whole thing. There are three main types of protein classes, and these I will talk about in a little bit more depth, because the three classes are actually more useful for us to understand. One type are called fibrous proteins. These are basically very long proteins, which look like, well, fibers. They look literally like strands of rope or something like that. Often there's sort of a few strands wound up together. And these play structural roles.

Starting point is 00:32:37 So you may have heard of keratin, collagen, elastin, and fibroin. These are all fibrous proteins. They're found in muscles, and also they support the structure of cells. I've mentioned them a few times in previous episodes. They form long filaments, which are shaped like rods or wires. They tend not to unfold as easily as other proteins because their structure is much simpler. They don't really have the elaborate mix of beta sheets and alpha helices and domains that other proteins have. They're just big long wires, basically, or rods.

Starting point is 00:33:03 And they play structural roles within cells. They provide protection and support. So they're pretty sort of simple proteins, if you like, but they're very important in, like, bones and tendons and muscles and things like that. Then there are the globular proteins, and I think this is the largest class. Most proteins are globular proteins. They're called globular because they're sort of spherical, globe-like. They're just a big mush, basically. I mean, they're not a big mush. The structure

Starting point is 00:33:26 is very important. But if you just look at them, they kind of look like a big mush, like a meatball or something like that, or a misshapen meatball, as opposed to the rods or wires of the fibrous proteins. Globular proteins are usually not very stable because the energy of the folded native state of the protein is only sort of a little bit lower than that of other possible conformations. So that means if you take them out of their environment or increase the temperature too much or change the pH or something like that, they're likely to misfold, as I'll talk about later. So they're not the most stable. They do need particular environments in order to work. But they perform most of the sort of protein

Starting point is 00:34:02 functions that we think about, aside from structure. There are some structural globular proteins, but mostly they act as enzymes, transporters through cell membranes, messengers, regulators, all of that sort of stuff. So we think about DNA polymerase, for example, or really any of the proteins that help with the replication and synthesis of DNA and RNA and things like that. Most of those proteins are globular proteins. The third type of proteins are called membrane proteins because they're associated with biological membranes,

Starting point is 00:34:31 so the lipid sort of bubble that surrounds a cell. And many medicinal drugs target membrane proteins because they're particularly important for transmitting signals telling cells to grow or to die or to produce this protein or not produce that protein and things like that. And that's very important for drugs. Some of these membrane proteins literally go through the membrane.

Starting point is 00:34:54 Others are just sort of attached to the side of it. So there's ones that are permanently attached, called integral membrane proteins, and then there's peripheral membrane proteins, which are sort of temporarily attached and can detach from the membrane. And as I said, they're very important as receptors and transporters.

Starting point is 00:35:07 Also cell adhesion molecules, so helping cells stick to each other or get separated; membrane proteins are very important for that. Membrane proteins are also the hardest to study, usually, as I mentioned earlier, because it's very hard to crystallize them, because you have to remove the lipid bilayer, the membrane, but when you do that, often the proteins aren't stable

Starting point is 00:35:27 anymore. They fold differently, basically. And therefore, what you're studying isn't the same protein anymore, so you've got to figure out a way of studying it without having the lipid membrane there, and that's very difficult. So, that's all I wanted to say about the tertiary structure of proteins. Now I just want to say a few words on the quaternary structure, which is the highest level of protein structure that we usually talk about. So you remember that I said that proteins are made up of polypeptide chains, a polypeptide being just a big long string of amino acids. Well, proteins can actually be made up of more than one polypeptide chain. So separate chains will be synthesized separately, so they're coded for by different sequences of DNA and they're

Starting point is 00:36:08 made separately, but then they sort of join up together in larger assemblies. So each of these polypeptide chains is called a protein subunit. You whack a few subunits together, and you've formed the overall protein. So hemoglobin, for example, the protein found in red blood cells, which transports oxygen around our body, is made up of four subunits: two copies each of two different types of subunit. So some proteins have multiple copies of the same type of subunit, other ones have different subunits. Often though there's some sort of symmetry or repeating pattern in the subunits, because often they're symmetrical in different ways. And so in the hemoglobin, for example, we have two copies of each of those, and they sort of all are,

Starting point is 00:36:47 or sit next to each other, basically like a four-square pattern. And some proteins can have a very large number of polypeptide subunits, up to, you know, like hundreds in some proteins, I think I've even heard. That's relatively rare. Usually there might be a handful or a few dozen or something like that. Now, the quaternary structure just refers to the relationship between these polypeptide subunits, how you fit them together. So not how you fit the amino acids together, that's actually primary structure,

Starting point is 00:37:13 it's how you fit the subunits together. That's quaternary structure, because you might be able to combine them in different ways. You know, this subunit goes here and this one on top or the other way around or something like that. So that's quaternary structure. And again, it gives just another level of complexity in which proteins can change or be different if you put the subunits into a different order or something like that. Right, so that's what I wanted to say about protein structure. We talked a bit more about conformations and secondary structure and motifs and structural domains, and we talked about protein classes and quaternary structure. This is all about the shape of proteins and what they look like, their structure. This is basically determined by X-ray crystallography and some similar techniques. But proteins also do things. They have a function, a biological function. So that's what I'm going to turn to now in the last part of the podcast and look a little bit at protein function. So what do they do? One important thing to understand about proteins, as I mentioned earlier, is that their

Starting point is 00:38:05 function is intrinsically and inherently related to their structure. Or, in fact, I would say that the function is determined by the structure. Proteins do a particular task because they have a particular shape. The structure is not incidental to the function, it's essential to it. And the structure of a protein is, in large part, determined by its folding. The conformation that it adopts, remember, is its native conformation, the lowest energy conformation that it has. That will be the conformation that it has in its sort of natural biological environment. Denaturation is a process in which proteins, or also sometimes nucleic acids, but here we'll just look at proteins, lose the structure which is present in their native state,

Starting point is 00:38:45 and therefore they move into a different conformation. This could be losing its quaternary structure, its tertiary structure, or secondary structure, any of those, but not the primary structure. So denaturation is only the higher-level structures, secondary and up. If you actually change the primary structure, essentially you've either cut a protein in two or you've made a new protein by substituting out an amino acid or something like that. So that's not what denaturation is. That's different. But any of the other structural changes, that's denaturation.

Starting point is 00:39:09 You're basically unfolding the protein. You're changing its structure. How can you do that? Well, applying any sort of external stress or change in its environment. So if you change the pH of the solution that it's in, or change the salt concentration, if you add an organic solvent like alcohol, for example, or if you apply radiation or heat, any of these can result in protein denaturation. And if proteins in a living cell are denatured, that means that the tertiary or secondary or quaternary structure of the protein is disrupted,

Starting point is 00:39:37 which means that its conformation is different, which means it's not in its native state conformation anymore, which means that it's not going to fulfill its biological function, because the function is determined by the structure that it's in. If you change the structure, you change the function. Usually you'll change it into something that's useless. It doesn't do anything or it doesn't do what you want it to. Perhaps it even does something harmful to the cell. And therefore, cell function is disrupted.

Starting point is 00:40:00 Basically, that's how you can kill a cell. You can heat it, and its proteins denature. And therefore, it can't function, and so it dies. That's one reason why human beings can't function in very high temperatures. Our proteins will denature. I mean, it's not the only reason, but that's one reason. Now, protein denaturing is actually one of the main things we do when we cook food, particularly meats, or meat-related products. So, for example, eggs.

Starting point is 00:40:23 Imagine frying an egg. You crack open the egg, and you have the mostly transparent egg white and the yellow egg yolk. But when you cook it, the egg white turns, well, white, and it also becomes hard. The consistency completely changes. What's going on there? I mean, the chemicals are the same. in the sense we haven't changed, we haven't added anything to it, we've just heated it up. So what's going on? What's actually happening? The main thing that's happening is the proteins

Starting point is 00:40:48 in the egg white and the yolk as well are denaturing. You're heating it up. The proteins lose their structure and so they change into a different form. And actually the, I forget the main name of the egg protein that does this, but it goes from being a soluble form, in which case it's dissolved in the solution and you can't see it. So that's why egg whites look transparent. It goes from being soluble to being insoluble when it's in its denatured form. So the denatured protein becomes insoluble, and then it comes out of solution, and you see it. It's like if you have sugar in a glass of water or something, which comes out of solution. You see it.

Starting point is 00:41:23 Well, same thing with the eggs. You see the protein because it's come out of solution, and therefore the egg white turns actually white and sort of harder and rubbery. The consistency completely changes because you're denaturing the proteins. The structure has changed. Now, those proteins are going to be useless for whatever they were intended to do, but of course that's fine. We don't care about that. We just want to eat them. It's the same thing that happens when we cook meat: it becomes tender because the proteins are denaturing and sort of becoming less rigid.

Starting point is 00:41:49 Protein folding is the opposite of denaturation. It's the process by which proteins fold up and come together in their native conformation. Now, there is an astronomically large number of possible different conformations that any protein could fold into, and it can't possibly try all of these, because, I mean, if you do the calculations, it would have to exist for tens of quadrillions of times longer than the universe has existed in order to try out all the possible different conformations and find the right one. So that's obviously not what proteins do when they're folding. It's a bit of a mystery

Starting point is 00:42:23 exactly how proteins do this, how they fold up in just the right way so that they find the most stable conformation given all the possibilities. We're starting to unravel that puzzle now. There's a few ideas which I'll talk about briefly, but it is true that we don't have a full answer to that puzzle yet. However, one thing that seems relatively clear, or a general principle at least, is an idea called Anfinsen's dogma, named after the scientist who essentially came up with the idea, which is that the amino acid sequence of a protein determines its native conformation, and really that's all that is necessary to specify the native conformation, the usual form in which

Starting point is 00:43:00 it's folded up. All you need is the amino acid sequence. You don't need anything else external to that. Now, this is not, I mean, it's not true literally, because there are exceptions and you have to have the right environment, so pH and concentration and temperature and so on have to be correct in order for this to work. And sometimes you also need things called molecular chaperones, which are additional molecules, usually proteins, which sort of help the protein fold up in the right way.
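
The "do the calculations" remark about trying every conformation can be made concrete with a back-of-envelope estimate. The numbers here are not from the episode: the sketch assumes a chain of 100 amino acids, 3 possible conformations per amino acid, 10^13 conformations tried per second, and a universe roughly 13.8 billion years old. Change any of them and the answer shifts by orders of magnitude, but it stays astronomically large.

```python
# Back-of-envelope estimate of an exhaustive conformation search.
# Every constant below is an assumption chosen for illustration.
RESIDUES = 100                   # assumed chain length
STATES_PER_RESIDUE = 3           # assumed conformations per amino acid
SAMPLES_PER_SECOND = 1e13        # assumed rate of trying conformations
SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 1.38e10  # approximate age of the universe


def exhaustive_search_years(residues=RESIDUES,
                            states=STATES_PER_RESIDUE,
                            rate=SAMPLES_PER_SECOND):
    """Years needed to visit every conformation once at the given rate."""
    conformations = states ** residues  # exact integer, about 5e47 by default
    return conformations / rate / SECONDS_PER_YEAR


def universe_lifetimes(residues=RESIDUES,
                       states=STATES_PER_RESIDUE,
                       rate=SAMPLES_PER_SECOND):
    """The same search time, expressed in multiples of the age of the universe."""
    return exhaustive_search_years(residues, states, rate) / AGE_OF_UNIVERSE_YEARS


if __name__ == "__main__":
    print(f"{exhaustive_search_years():.1e} years")
    print(f"{universe_lifetimes():.1e} times the age of the universe")
```

Under these particular assumptions the search takes on the order of 10^17 times the age of the universe, which is in the same territory as the figure quoted in the episode. The point is only that random search is hopeless, so folding must follow some more directed route.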

Starting point is 00:43:23 So there's complexity there, but the basic idea is, I think, more or less right, that the main determinant of the shape of a protein is simply its amino acid sequence, which means that a protein is in some sense self-folding. You don't need to add extra information to tell it how to fold. It sort of knows how to fold based just on what protein it is. Because the amino acid sequence determines what protein it is, it also determines how it folds.

Starting point is 00:43:46 So what protein it is determines how it folds. It's not like that you add extra information telling a protein how to fold. That's essentially what Anfinsen's dogma says. Now, how it happens, though, is, as I said before, still a bit murky. There's a few ideas. One is that there's a sort of a hierarchical process where, first of all, you have secondary structures forming, like the alpha helices and the beta sheets, and then the super secondary structures form, so the motifs,

Starting point is 00:44:10 and then afterwards the higher tertiary structures form, and then finally the quaternary structure. So it sort of goes in stages, and that would cut down the number of possible different conformations a lot by going in stages like that. Another idea, which is not necessarily contradictory with the stages idea, is a sort of a folding funnel hypothesis, which is basically where the protein sort of moves down in an energy well,

Starting point is 00:44:35 if you like. This is a sort of metaphorical language. But the idea is it continually moves to lower and lower energy states, more and more stable conformations, basically. And sometimes it can kind of get stuck. You can imagine if I dug a hole, and then on the side of the hole I sort of dug a small groove. Well, if I poured water into the hole, some of that water would get stuck in that small groove there, even though it could actually get lower by coming out of the groove and going

Starting point is 00:45:00 into the main part of the hole. But the thing is, it would have to go up and then down. That's what we call a local maximum, or in this case a local minimum, actually. Basically, you could get lower, but to get lower you have to go higher first. You have to sort of climb the small hill to get down into the deeper valley. The idea is that the energy funnel that the protein goes through in folding up might have a bunch of these local minima, which makes it a little bit harder to fold up, and that sort of complicates things.
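
The hole-and-groove picture can be sketched numerically. This is a toy, not a model of real protein folding: the one-dimensional energy curve below is invented for illustration and has a deep valley near x = -1.47 and a shallower groove near x = 1.35. Something that only ever moves downhill ends up in whichever valley is nearest to where it started.

```python
def energy(x):
    """Invented 1-D energy curve with a deep valley and a shallower groove."""
    return x**4 - 4 * x**2 + x


def slope(x):
    """Derivative of energy(x); positive means the curve rises to the right."""
    return 4 * x**3 - 8 * x + 1


def roll_downhill(x, step=0.01, iterations=5000):
    """Repeatedly take a small step in the downhill direction (gradient descent)."""
    for _ in range(iterations):
        x -= step * slope(x)
    return x


if __name__ == "__main__":
    for start in (-2.0, 2.0):
        end = roll_downhill(start)
        print(f"start {start:+.1f} -> settles at {end:+.3f}, energy {energy(end):.3f}")
```

Starting on the right, the walker settles in the shallow groove even though the valley on the left is lower, because getting there would mean climbing the hill in between first. That is the sense in which a folding protein might get stuck in a local minimum.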

Starting point is 00:45:27 And maybe that's why you need the chaperones or something like that. But anyway, that's kind of complicated. The main point to understand is that folding is a spontaneous process. You don't have to add energy to get a protein to fold. You don't have to add information to get it to fold. It does require the right conditions and maybe some chaperones, but basically it folds by itself, which is quite remarkable, actually, and we still don't fully understand how that happens. And some proteins can actually refold, so if you unfold them, if you denature them, and put them back in the right environment, they'll just fold up again. Quite remarkable that they can do that. The other key thing to note about

Starting point is 00:46:00 protein function is that a lot of proteins are enzymes. Enzymes are catalysts, which I've talked about in previous chemistry episodes, so I won't get into the detail of that here, but basically a catalyst is just some substance which helps to speed up a chemical reaction without itself being used up in the chemical reaction. So enzymes are essential to sustain biological life because many of the chemical reactions that occur inside living organisms

Starting point is 00:46:24 would be far too slow without the help of enzymes, and most enzymes in living organisms are proteins. The molecule or molecules that the enzyme acts on are called the substrates, and the part of the protein or enzyme where they react is called the active site. The active site has to be shaped in a very particular way in order to react or interact with the substrate. The preferred model of how that works at the moment is called induced fit.

Starting point is 00:47:03 A sort of simpler model is called lock and key, where basically you think of the enzyme as a lock, and the substrate comes in as a key, and they fit exactly. They're just like a hand in a glove. They're a perfect fit for each other. And that's true for some enzymes, but more commonly, it's sort of an imperfect fit.

Starting point is 00:47:19 The enzyme isn't perfectly shaped to fit the substrate coming in, but it's sort of approximately shaped to fit it, and then as they start to interact, the fit improves. So you could kind of imagine it as putting your hand in a glove which doesn't quite fit, but then sort of morphs itself a bit as you put your hand in, and eventually you get a nice perfect fit. So that's the induced fit model of substrate-enzyme interaction. But the basic point is that you have some chemical, some molecule, whatever it is,

Starting point is 00:47:46 coming in to the active site and it binds to the enzyme, or it might be more than one, one or more than one substrate, binds to the active site, and then the enzyme does something to it. Often this is physically pushing it into a different conformation which allows some reaction to happen, physically pushing different molecules together, for example, so they can combine, or it might be cutting a bond between them or something like that. And this dramatically speeds up the reaction between the two substrates, and then they're released, and then, you know, another lot of substrate comes in. Sometimes proteins need help in order to perform their enzymatic function.

Starting point is 00:48:18 So a cofactor is a non-protein compound which is required for a protein's biological activity. Some of them are also called coenzymes. They're sort of helper molecules. Remember hemoglobin, the blood protein that we talked about that carries around oxygen? Well, the heme group refers to a group which is basically an iron atom with some other carbons and other things around it. It is not a protein. It's not made of amino acids, but it forms part of the big protein complex called hemoglobin. The iron atom is actually crucial because that's what, it turns out, that is what the oxygen

Starting point is 00:48:53 actually binds to. It's the iron bit. So if you take out heme from hemoglobin, the whole thing, I mean, it doesn't do anything. It's useless. So you need that heme bit, and particularly the metal atom in the middle there, in order for it to fulfill its useful function. But the heme bit is not actually a protein itself. It's not made of amino acids. So we call it a cofactor. It's associated with the enzyme. It's part of the enzyme, but it's not, strictly speaking, a protein. So it's sort of like a helper molecule. And there are many examples of these in biology.

Starting point is 00:49:19 Many of them are metals. So there's an increasingly popular and important field of research called metalloproteins, or metallobiological chemistry, which is studying how metals interact with proteins and biological systems, and they're increasingly seen as being very important. Okay, one final concept that I want to discuss called

Starting point is 00:49:41 allosteric regulation. This is quite interesting. Regulation refers to basically determining how active, in this case, enzymes are, whether they're active or not, whether they're doing their job or just sitting around idling. Now, one way that you can activate or deactivate a protein is you can put something in the active

Starting point is 00:49:57 site. You can cover up the active site, or you can insert something in there that's sort of like a dummy key. It fits in the lock, but it doesn't actually do anything. It doesn't unlock the door. It just sort of sits in there. You imagine inserting the wrong key into a lock: it goes in, but it doesn't turn. That's one way you can deactivate an enzyme. But there's another way you can deactivate it, which is by interacting with an allosteric site. Now, an allosteric site is just another bit of the protein that's away from the active site. So the active site's where the action happens, where the substrates bind and stuff happens. The allosteric site's just somewhere else on the protein, anywhere else really.

Starting point is 00:50:34 The cool thing about allosteric sites is that they can affect the active site, they can do something to it, even though they're on a different part of the protein. And so what often happens is that something comes along, some molecule comes along, binds to the allosteric site, which is, remember, on a completely different part of the protein, and then that causes some change in the protein, some conformational change, change in its shape, or something like that, which in turn propagates through the protein and eventually has an effect on the active site. Perhaps it covers up the active site, or maybe it uncovers the active site, or it changes its shape, or something like that. So this is called allosteric regulation.

Starting point is 00:51:08 You don't target the active site directly. You target this allosteric site, which is on another bit of the protein. So it's sort of like an indirect attack. You don't go straight for the main thing you're interested in, the active site. You attack it through sort of the back door and have an indirect effect, which then affects the active site. So allosteric regulation is increasingly important in drug design, because rather than looking at the active site where the interesting thing is happening, maybe that's where the pain receptors are firing or something like that, some crucial interaction necessary for the pain receptors to fire, instead of focusing on the active site, we look at the allosteric site.

Starting point is 00:51:40 Maybe it's easier to get drugs to target that bit, or maybe the allosteric site's more conserved across, say, the species of bacteria that we're trying to treat or something like that. Or maybe it could be useful in targeting cancer cells, because we can target the allosteric site of some protein necessary for cancer cells instead of the active site. So there's a lot of possibilities there, and it's increasingly important for drug design to look at allosteric regulation. There are different types of allosteric interactions, homotropic and heterotropic, and basically the difference between those is: is the thing that binds at the allosteric site the same as the substrate that binds at the active site, or is it different?

Starting point is 00:52:17 So homotropic is the same thing: the substrate can bind at the active site or it can bind at the allosteric site. Heterotropic is where it's different things that bind. So there are different types of allosteric regulation as well, but the same basic idea. All right, so that's all I wanted to discuss today. Hopefully you found that relatively interesting and not too hard. I recognize this was a bit of a more challenging episode because I did rely on a fair bit of background knowledge about biochemistry and there was also some optics in there.

Starting point is 00:52:46 But don't feel too bad if you didn't understand everything. Again, this podcast is designed so that you can listen back to an episode multiple times. Maybe you've listened to it now, you've picked up some things, you go back and listen to some previous episodes on optics and biochemistry and other things in chemistry, listen to this again in a few months, and you'll pick up more of it. So, hopefully you can get something out of it and not feel too worried that you didn't get everything. If you enjoyed the podcast, I'd be very grateful if you would jump onto iTunes and give the podcast a star review or a rating or both. It's very important and useful to increase the visibility of the podcast in the iTunes

Starting point is 00:53:20 store so that other people can find it. Also, you can visit us on Facebook. The podcast has a Facebook page where I post up news occasionally and also visual material to accompany the podcasts. So just type in The Science of Everything Podcast into Facebook and you'll be able to find that and give the page a like, also a way of supporting the show. And also, if you'd like to send any feedback or questions or just tell me about when you listen or whatever, I love to hear from listeners. My email address is Fods12 at gmail.com. That's F-O-D-S-1-2 at gmail.com. Thanks again for listening and I'll talk to you next time.
