Genes: Recipes for proteins
A gene is a unit of DNA that codes for a protein or part of a protein. To make a protein, the cell uses a gene’s precise sequence of nitrogenous bases (A, T, C and G) as a guide.
Essential for the survival of all living organisms, proteins come in a wide variety of forms and functions. For example, variations in pigmentation proteins are what give panda bears their unique black-and-white coat. The opsin proteins in the eyes of bees are sensitive to ultraviolet light and flower patterns invisible to the human eye, which is crucial for flower pollination.
The genome can be compared to a cookbook filled with recipes, as it contains many genes that each produce a particular protein.
Alleles: Recipe variants
When one or more of a gene’s nitrogenous bases differ, these variants are called alleles.
For example, genes that influence eye colour have alleles that can result in brown, blue, green or grey eyes.
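As a rough illustration of what "one or more bases differ" means, the Python sketch below compares two sequences position by position and lists where they differ. The function name `base_differences` and both sequences are made up for this example; they do not correspond to any real gene.

```python
def base_differences(allele_a: str, allele_b: str) -> list:
    """Return (position, base_a, base_b) wherever two equal-length sequences differ."""
    if len(allele_a) != len(allele_b):
        raise ValueError("This sketch only compares sequences of equal length")
    return [
        (i + 1, a, b)  # positions are counted from 1
        for i, (a, b) in enumerate(zip(allele_a, allele_b))
        if a != b
    ]

# Two hypothetical 12-base sequences standing in for alleles of the same gene.
allele_1 = "ATGCACCCATAA"
allele_2 = "ATGCGCCCATAA"
print(base_differences(allele_1, allele_2))  # [(5, 'A', 'G')]
```

Here the two made-up alleles differ at a single base, the fifth one.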
Gene expression
All of an organism’s cells contain identical DNA. However, only the sections that a specific cell needs to function are expressed. This is what makes cells (such as neurons or muscle cells) different based on their roles in the body.
The way cells package their DNA, depending on the sections they need to access, is referred to as epigenetics. When a gene is activated, its DNA is transcribed into messenger RNA (mRNA). This mRNA is then translated by ribosomes into a functional protein.
Thanks to this precise regulation, each gene is expressed at the right time and in the right place in the organism.
Did you know?
The size of an organism’s genome does not correlate with its complexity.
For example, the human genome has around 3.2 billion base pairs, while the largest known genome belongs to the Tmesipteris oblanceolata fern, which has 160 billion base pairs!
Human | Paris japonica | Tmesipteris oblanceolata | Nasuia deltocephalinicola (a bacterium) | Arabidopsis thaliana |
3.2 billion base pairs | 149 billion base pairs | 160 billion base pairs | 112,000 base pairs | 115 million base pairs |
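To put these figures in proportion, the short Python sketch below divides each genome size by the size of the human genome. The sizes are the ones listed in the table; the names `GENOME_SIZES_BP` and `relative_to_human` are illustrative choices.

```python
# Genome sizes in base pairs, taken from the table above.
GENOME_SIZES_BP = {
    "Human": 3_200_000_000,
    "Paris japonica": 149_000_000_000,
    "Tmesipteris oblanceolata": 160_000_000_000,
    "Nasuia deltocephalinicola": 112_000,
    "Arabidopsis thaliana": 115_000_000,
}

def relative_to_human(sizes: dict) -> dict:
    """Return each genome size as a multiple of the human genome size."""
    human = sizes["Human"]
    return {name: size / human for name, size in sizes.items()}

ratios = relative_to_human(GENOME_SIZES_BP)
# Print the organisms from largest to smallest genome.
for name, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {ratio:.6g} x the human genome")
```

With these numbers, the Tmesipteris oblanceolata genome works out to 50 times the size of the human genome.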
Learn more
Only 1% of our genome directly codes for protein synthesis. The rest, called the non-coding genome, plays a role in gene regulation. These DNA regulatory regions, such as enhancers, promoters and other sequences, regulate gene expression. In genes, promoters are binding sites located right beside the start site for transcription, which is when DNA is copied into RNA. Enhancers increase the activity of certain genes, sometimes at a distance from the transcription start site.
Proteins called transcription factors direct the timing and intensity of gene activation and deactivation, which is why some cells express certain genes more than others, why not all cells produce the same proteins, and why cells have different functions.
Studying and understanding these regulatory regions is an important part of genomics research.
Source: The Human Genome Project pieced together only 92% of the DNA – now scientists have finally filled in the remaining 8% (theconversation.com)
Protein synthesis
Cells synthesize proteins using DNA as a kind of guide or recipe book.
Steps in protein production:
- The DNA information to create a protein (the gene) is copied into messenger RNA (mRNA).
- The mRNA then leaves the nucleus, enters the cytoplasm and binds to a ribosome.
- The ribosome moves to the rough endoplasmic reticulum (RER), where it reads the mRNA sequence and assembles the amino acids into a polypeptide chain.
- Once complete, the chain is released into the RER, where it is processed and folded into a protein.
- Proteins are carried by transport vesicles from the RER to the Golgi body.
The Golgi body processes the proteins before sending them to the cell surface or into the cytoplasm. The Golgi body also adds chemical tags to classify the new proteins according to their final destination.
Proteins that will be secreted out of the cell or embedded into the cell membrane are packaged into transport vesicles that bud from the Golgi body. These vesicles fuse with the cell membrane, and the protein is either released from the cell or delivered into the membrane.
Protein structure
Proteins are made of amino acids that bind in a specific order to create polypeptide chains.
When a protein is being made, its polypeptide chain must fold up into a specific three-dimensional shape so that the protein can function properly and fulfill its role. An error in the protein’s amino acid sequence or shape can impact its function and cause a problem for the organism.
In eukaryotic cells, the first step in protein production involves locating the target gene, i.e. the DNA‑sequence “recipe” that codes for the desired protein.
Once the gene is located, the DNA sequence is copied into a molecule called messenger RNA (mRNA). RNA is similar to DNA but has only a single strand and uses the nitrogenous base uracil (U) instead of thymine (T). The process of creating this RNA copy is called transcription, which is orchestrated by a number of proteins.
Unlike DNA, mRNA can leave the cell nucleus and travel to the ribosomes, which are structures in the cytoplasm that translate information from mRNA.
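The copying step can be sketched in a few lines of Python. This is a deliberate simplification: it assumes the gene is given as its coding strand, so the mRNA has the same sequence with every T replaced by U, and it ignores everything else the cell does during transcription. The function name `transcribe` and the 12-base sequence are made up for illustration.

```python
def transcribe(coding_strand: str) -> str:
    """Return the mRNA sequence for a DNA coding strand (each T becomes U)."""
    dna = coding_strand.upper()
    invalid = set(dna) - set("ATCG")
    if invalid:
        raise ValueError(f"Unexpected DNA bases: {sorted(invalid)}")
    return dna.replace("T", "U")

# Hypothetical 12-base sequence, not taken from any real gene.
example_gene = "ATGCACCCATAA"
print(transcribe(example_gene))  # AUGCACCCAUAA
```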
Codons: Three-letter codes
Codons are sequences of three nitrogenous bases, each of which codes for a particular amino acid. The order of the nitrogenous bases in each codon is very important. For example, C-A-C codes for the amino acid histidine, whereas C-C-A codes for the amino acid proline. As a comparison, the words “pare” and “pear” in a recipe book may have the same letters, but their different order means something important for how our recipe will turn out.
Ribosomes are organelles that “read” mRNA one codon after another to make the backbone for the protein (the polypeptide chain). Ribosomes are like cooks who read and interpret recipes to make proteins.
With each codon, an amino acid is added to the polypeptide chain until all the codons have been read and the ribosome reaches a stop codon on the mRNA. Stop codons mark the end of the coding sequence and signal to the ribosomes to stop synthesizing the protein and release the newly created polypeptide chain.
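The codon-by-codon reading described above can be sketched in Python as follows. The codon table is only a small excerpt of the standard genetic code (the full table has 64 codons), and the function name `translate` and the example mRNA sequences are made up for illustration.

```python
# Small excerpt of the standard genetic code (the full table has 64 codons).
CODON_TABLE = {
    "AUG": "Met",  # also the usual start codon
    "CAC": "His",
    "CCA": "Pro",
    "GGC": "Gly",
    "UUU": "Phe",
}
STOP_CODONS = {"UAA", "UAG", "UGA"}

def translate(mrna: str) -> list:
    """Read an mRNA three bases at a time and return the amino acids up to the stop codon."""
    chain = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            break  # stop codon: release the chain
        if codon not in CODON_TABLE:
            raise KeyError(f"Codon {codon} is not in this partial table")
        chain.append(CODON_TABLE[codon])
    return chain

# Same bases in a different order give a different chain.
print(translate("AUGCACCCAUAA"))  # ['Met', 'His', 'Pro']
print(translate("AUGCCACACUAA"))  # ['Met', 'Pro', 'His']
```

Swapping the order of the CAC and CCA codons swaps histidine and proline in the resulting chain, which is why the order of bases matters.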
At the end of the translation process, the polypeptide chain may be further modified; for example, it may be folded into its functional three-dimensional structure or receive specific chemical groups. These are called post-translational modifications.
Alternative splicing is when a single gene codes for different proteins.
After DNA is transcribed into mRNA, the mRNA is spliced: its introns are removed and its exons are joined together. In alternative splicing, certain exons can be kept or left out.
Exons are the coding sections of mRNA that determine the amino acid sequence of the final protein. Introns are non-coding sections that are not involved in forming the amino acid sequence.
Through this mechanism, the same gene can create different mRNA molecules that will translate into different proteins.
To illustrate splicing, let’s take the word EPICUREAN. If we remove some of its letters, we can form the words EPIC, CURE, PAN, RAN and so on. The same thing happens when genes are spliced. Some of the information is kept while other parts are removed to produce different copies of mRNA, which in turn produces different proteins.
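The word analogy can be checked with a short Python sketch. It treats splicing as deleting letters while keeping the remaining ones in their original order, which is a simplified picture of how retained exons stay in sequence. The function name `can_be_spliced_from` is made up for this example.

```python
def can_be_spliced_from(source: str, target: str) -> bool:
    """True if target can be formed by deleting letters from source without reordering."""
    remaining = iter(source)
    # Each "in" test consumes the iterator up to the match, so order is enforced.
    return all(letter in remaining for letter in target)

for word in ["EPIC", "CURE", "PAN", "RAN", "NAP"]:
    print(word, can_be_spliced_from("EPICUREAN", word))
```

EPIC, CURE, PAN and RAN all pass the check. NAP does not, because its letters appear in EPICUREAN in a different order.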
A protein’s three-dimensional shape is what determines its function.
There are four levels of structure that lead to the protein’s final shape:
- Primary structure: A protein is initially translated into a long chain of amino acids called a polypeptide. In this form, the protein is not yet functional.
- Secondary structure: Next, the polypeptide chain folds in on itself. Hydrogen bonds between atoms in the chain’s backbone form α-helices or β-sheets that give the protein stability.
- Tertiary structure: The protein then takes on its final three-dimensional structure through interactions between its amino acids. It becomes functional in this form.
- Quaternary structure: Some proteins have multiple polypeptides. A quaternary structure is the arrangement of these polypeptides.
More complex proteins consist of multiple linked polypeptides that are coded by different genes.
An example is hemoglobin, a protein in red blood cells that transports oxygen. It has four subunits, and the different types of subunit have their own amino acid sequences coded by different genes. To synthesize hemoglobin, these genes must all be expressed so that the different subunits can be produced and then assembled to form the final protein.
Proteins: The workers of the cell
Proteins are essential to the functioning of living organisms. They are highly varied and perform a wide array of biological functions.
Protein function
Proteins perform many functions in living organisms, including acting as enzymes or as the main component of some hormones. Some proteins are used for cell-to-cell communication, while others transport molecules or play a role in defending the organism.
Type of protein | Role | Example |
Enzymes | Enzymes speed up (catalyze) chemical reactions in living organisms and create the right conditions for these reactions to take place. | Lactase is an enzyme that lets us digest lactose. |
Hormones | Hormones are chemical messengers that send information to different parts of the body. | Insulin is a protein hormone that plays a role in glucose regulation. |
Signalling proteins | These proteins help with communication between cells, by binding to the cell membrane and sending signals to the cell, for example. | Membrane receptors allow cells to receive signals. |
Contractile proteins | These proteins are responsible for muscle contraction. | Proteins such as actin and myosin help muscles contract. |
Transport proteins | These proteins allow molecules to travel from point A to point B. | Hemoglobin is a protein that transports oxygen. |
Support proteins | Support proteins provide mechanical support for cells. | Proteins that form the cell cytoskeleton. |
Defence proteins | These proteins fight off external pathogens and promote wound healing. | Antibodies are proteins that recognize pathogens. |
Storage proteins | These proteins store up nutrient reserves. | This is thought to be the role of ovalbumin, the main protein in egg white. |
Proteins and phenotype
An organism’s set of observable traits, such as the colour of a plant’s flowers, is called its phenotype. These traits depend on the proteins that produce physical characteristics and the genes that code for them.
- Variations in pigmentation proteins are what give panda bears their unique black-and-white coat.
- A snail’s shell is hard because of the shell’s protein matrix.
- The proteins produced by some mushrooms are secondary metabolites with medicinal or toxic properties.
- Yaks have a mutation that increases their production of hemoglobin (a protein), which improves their ability to transport oxygen at a high altitude.
- The opsin proteins in the eyes of bees are sensitive to ultraviolet light and flower patterns invisible to the human eye, which is crucial for flower pollination.
- Some insects develop resistance to pesticides by producing enzymes (proteins) that neutralize these chemicals.
- Certain fish that live in icy waters produce antifreeze proteins that prevent ice crystals from forming in their blood and let them survive in extreme conditions.