Objectives

Upon completion of this module topic, you should:

Be able to describe the basic structure of DNA.
Be able to name the three basic parts of a nucleotide.
Be able to explain how DNA differs from RNA.
Be able to explain the significance of nucleotide sequences.

Part A

DNA Technology

This is Part A, DNA Technology, under the module topic Content Overview. This topic part has two sections: Content Tutorial and Animations.

Content Tutorial

DNA Technology

A simple definition of genetics is the study of genes. Genes consist of a threadlike molecule called deoxyribonucleic acid, or DNA. DNA is a double helix composed of two intertwined nucleotide chains oriented in opposite directions. When DNA is copied, the two chains separate and each serves as a template for making identical daughter DNA molecules.

DNA can also serve as a template to make ribonucleic acid, or RNA. The nucleotide sequence of RNA can be translated into the amino acid sequence of a protein. Proteins are the main determinants of the basic structural and physiological properties of an organism.

Over the last two decades there has been an increasing barrage of articles in the medical literature, as well as in the press, promoting the uses of the new DNA technologies. Terms like DNA probes, plasmid profiles, DNA hybridization, DNA fingerprinting, Southern and Western blots, and polymerase chain reaction (PCR) appear in over 95% of articles in current issues of the Journal of Clinical Microbiology. While microbiologists have been the first to use molecular techniques in many clinical applications, uses in other disciplines are growing as well. Techniques utilizing DNA technology, whether straight hybridization or amplified technologies, are now used routinely in many clinical laboratories, especially the larger or high-throughput infectious disease labs. For example, Gen-Probe’s PACE hybridization tests for gonorrhea and Chlamydia have been available since the late 1980s and are still used by many STD labs today.

Certainly molecular methodologies appear to be the way of the future. There is also a big push to get amplified technologies into the blood bank, where they would be used to screen blood donations. Because the amplified technologies are much more sensitive than existing screening tests (usually enzyme immunoassays, or EIAs), they have the potential to further narrow the window between infection and seroconversion, during which an infected unit with a very low, undetectable level of the agent could enter the blood supply. Commercial kits are currently available for some viruses, TB, and a few other infectious diseases such as STDs, but a long lead time is required to develop a diagnostic test and get it through regulatory approval. Because of this, many of the bigger clinical labs, especially at the large teaching hospitals or high-volume labs, are also willing to validate their own “home-brew” PCR assays for use as diagnostic aids. The big labs want not only amplified technologies but also automation. You need to be aware of the present impact of molecular biology, and also to think of a future that includes robotics, DNA and protein microarrays, microfluidics, certain types of genetic screening prior to the administration of drugs, and testing for the expression of genes, not just their presence.

Animations

Please review the following videos that provide a basic overview on DNA, RNA, and Proteins.

A basic description of DNA, RNA, and Proteins.
From DNA to Protein (shadowlabs.org, Multimedia Document)
A description of how DNA copies itself.
How DNA Copies Itself (shadowlabs.org, Multimedia Document)

Part B

Biochemistry of DNA and RNA

This is Part B, Biochemistry of DNA and RNA, under the module topic Content Overview. This topic part has three sections: Content Tutorial, Animations, and Activities.

Content Tutorial

The Biochemistry of DNA and RNA

Deoxyribonucleic Acid (DNA)

The structural unit of DNA is the nucleotide, and DNA itself is a polymer of nucleotides. A nucleotide has three basic parts:

1. One of four possible nitrogenous bases. In DNA these are the purines (adenine and guanine) and the pyrimidines (thymine and cytosine).

2. A pentose sugar called deoxyribose.

3. A phosphate group.

Each nucleotide is named according to its nitrogenous base: adenine for the base and adenosine for the base joined to the sugar, guanine for the base and guanosine for the base joined to the sugar, etc. Strictly, adenosine and guanosine name the base plus sugar (the nucleoside; deoxyadenosine and deoxyguanosine in DNA), and the whole nucleotide also carries the phosphate group.

Nucleotides are linked together by phosphodiester bridges between hydroxyl groups of the deoxyribose molecules of adjacent nucleotides (the 3’ hydroxyl of one sugar and the 5’ hydroxyl of the next). This leaves the nitrogenous bases hanging free.

The nucleotides in DNA exist as two strands, coiled to form a double helix. The sugar-phosphate backbones of the two strands run in opposite directions (antiparallel). The strands are held together by hydrogen bonding between complementary bases. Pyrimidines (small) bond only with purines (large): cytosine bonds with guanine, and adenine bonds with thymine.

DNA double helix: Note the antiparallel backbone in the lower figure.
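
The pairing rules and the antiparallel arrangement described above can be sketched in a few lines of Python. This is only an illustration and not part of the original tutorial: the function names and the example sequence are hypothetical, and the sketch assumes the input contains only the letters A, C, G and T.

```python
# Watson-Crick pairing in DNA: A pairs with T, and C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand):
    """Return the base-by-base partner of each base, in the same left-to-right order."""
    return "".join(PAIRS[base] for base in strand.upper())


def reverse_complement(strand):
    """Return the partner strand written 5'->3'.

    Because the two strands are antiparallel, the partner strand must be
    reversed before it can be read in the usual 5'->3' direction.
    """
    return complement(strand)[::-1]


top = "AATCGCGT"  # hypothetical example sequence, written 5'->3'
print("5'-" + top + "-3'")
print("3'-" + complement(top) + "-5'")
print("partner strand written 5'->3':", reverse_complement(top))
```

Printing the complement directly under the original strand shows the pairing position by position, while the reversed form is the one that would normally be written down for the second strand.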

Structure of Ribonucleic Acid (RNA)

RNA differs from DNA as follows:

1. The carbohydrate is ribose instead of deoxyribose.

2. The pyrimidine base uracil replaces thymine as a nitrogenous base.

3. RNA is found as single strands. [The RNA may have nucleotides within its sequence that base-pair with other bases in the same strand. This gives the RNA molecule its secondary structure.]

4. RNA is found as four different types with four different roles: messenger RNA (mRNA), transfer RNA (tRNA), small nuclear RNA (snRNA), and ribosomal RNA (rRNA).

In eukaryotic cells, DNA is found mainly in the nucleus, while RNA is mainly produced in the nucleus but freely enters the cytoplasm. Primary transcripts are the newest RNA molecules, exactly as transcribed from DNA, and are not found outside the nucleus. They contain introns and exons. To make mRNA, the introns are spliced out and modifications that stabilize the RNA are added to its ends. mRNA is the molecule that provides the sequence for making proteins. rRNA and many protein subunits make up the ribosomes, which function in the cytoplasm and provide the translation machinery. Each tRNA carries one amino acid, and each amino acid has one or more tRNAs that carry it. Each tRNA molecule brings one amino acid to the growing protein strand in the ribosome.

University of Calgary Biotechnology Training Centre

Animations

Howard Hughes Medical Institute: Biointeractive Animations

DNA Packaging (www.hhmi.org, Multimedia Document)
Building Blocks of DNA (www.hhmi.org, Multimedia Document)
Chargaff’s Ratio (www.hhmi.org, Multimedia Document)
Paired DNA Strands (www.hhmi.org, Multimedia Document)

Activities

1. Howard Hughes Medical Institute: Interactive Click and Learn Presentations

a. RNA Diversity (www.hhmi.org, Multimedia Document) – The RNA Diversity Click and Learn presentation includes 13 slides that you can click through to learn about the different types, structures, and sizes of RNA and the many types of ribozymes.

b. RNA Interference (www.hhmi.org, Multimedia Document) – The RNA Interference Click and Learn presentation includes 16 slides that you can click through to learn how double-stranded RNA mediates gene silencing, an effect known as RNA interference.

2. University of Utah’s Genetic Science Learning Center: DNA to Proteins

a. Build a DNA Molecule (learn.genetics.utah.edu, Multimedia Page)

Part C

The Biology of DNA/RNA

This is Part C, The Biology of DNA/RNA, under the module topic Content Overview. This topic part has three sections: Content Tutorial, Animations, and Activities.

Content Tutorial

Information Flow from DNA to Protein

DNA > RNA > Protein

Central Dogma of Cellular Molecular Biology

The hierarchy of cellular information is DNA to RNA to protein. DNA is replicated into more DNA; DNA is transcribed into RNA; RNA is translated into protein. Under normal conditions, this is the only direction of information flow. The processes of information flow are DNA replication in preparation for cell division, and transcription of DNA into RNA and translation of RNA into protein in growing or active cells.

A convention is used to write nucleotide sequences. The phosphate and (deoxy)ribose are the same for each nucleotide, so the shorthand indicates only the bases. The left-hand end of the molecule is the free 5’ end, and the right-hand end has a free 3’ OH group. This convention is based on the polymerization of DNA or RNA: the free 3’ OH is the site of addition of the 5’ phosphate of the next nucleotide.

5’ A – A – T – C – G – C – G – T 3’

DNA replication, RNA transcription and protein translation are accomplished by polymerizing enzyme complexes that use base pairing with the nucleotides of one strand as a template for the new strand. This can be demonstrated by the following. A few other important points are illustrated by the diagram below.

The DNA that encodes the protein is, by convention, put in the top line of sequence and is called the coding strand. The sequence is written left to right (5’>3’), in the same order in which the protein itself is synthesized. The complementary strand does not have to be given in written sequences, because it can always be deduced from the other strand by base pairing, but it is important to recognize that this complementary strand is the one used to make the RNA copy and is thus called the template strand. The RNA is not called mRNA until the introns have been spliced out, leaving the exons, and the ends of the RNA have been protected by the addition of a cap and a tail (the polyA sequence, a run of adenines). mRNA is always written 5’>3’ as well, a message that can be understood.
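
The relationship between the coding strand, the template strand and the RNA can be sketched in Python. This is a simplified illustration that is not part of the original tutorial: the function names and the example sequence are hypothetical, and splicing, capping and the polyA tail are ignored.

```python
DNA_PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}
# When RNA is built against a DNA template, uracil (U) pairs with adenine.
RNA_PAIRS = {"A": "U", "T": "A", "C": "G", "G": "C"}


def template_strand(coding):
    """Deduce the template strand (written 5'->3') from the coding strand."""
    return "".join(DNA_PAIRS[base] for base in reversed(coding.upper()))


def transcribe(coding):
    """Shortcut: the RNA reads like the coding strand with U in place of T."""
    return coding.upper().replace("T", "U")


def transcribe_from_template(template):
    """Build the RNA by pairing against the template strand.

    The template is read from its 3' end so that the RNA comes out 5'->3'.
    """
    return "".join(RNA_PAIRS[base] for base in reversed(template.upper()))


coding = "ATGTACTATTAA"  # hypothetical coding strand, 5'->3'
template = template_strand(coding)
print("coding strand  :", coding)
print("template strand:", template)
print("RNA            :", transcribe_from_template(template))
```

Both routes give the same RNA, which is why written sequences usually show only the coding strand even though the template strand is the one that is actually copied.
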
The coding ability of DNA, with its 4 bases, comes from the use of 3 bases (a codon) to encode each amino acid in a protein. There are 4 x 4 x 4 = 64 possible three-base combinations to encode the 20 amino acids found in proteins. Thus more than one combination can encode the same amino acid; for example, the codons TAC and TAT both encode Tyr (tyrosine). Four 3-base sequences encode punctuation: AUG in the mRNA encodes the first amino acid of the new protein, and 3 other combinations encode a stop signal. Amino acids are sometimes also indicated by their one-letter codes, such that the sequence above would be MSPVYYRYVA.
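
The arithmetic of the triplet code can be checked with a short Python sketch. It is an illustration only and is not taken from the original tutorial: the compact 64-letter string is one common way of writing the standard genetic code with the bases ordered T, C, A, G, and the names `CODON_TABLE` and `translate` as well as the example sequence are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon (TTT, TTC, TTA, TTG, TCT, ...).
# "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_dna):
    """Translate a coding-strand DNA sequence codon by codon until a stop codon."""
    seq = coding_dna.upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
print("total codons      :", len(CODON_TABLE))                      # 4 x 4 x 4
print("stop codons       :", stop_codons)
print("amino acids       :", len(set(CODON_TABLE.values()) - {"*"}))
print("TAC and TAT encode:", CODON_TABLE["TAC"], CODON_TABLE["TAT"])
print("ATGTACTATTAA ->", translate("ATGTACTATTAA"))
```

The table is keyed by DNA codons on the coding strand; replacing T with U gives the corresponding mRNA codons, so ATG here corresponds to the AUG start codon in the text.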

The figure above depicts the 64 mRNA codon combinations that translate into the 20 amino acids. For more information about the one-letter codes for the amino acids, see Amino Acid codes.

Information in a cistron/gene

In prokaryotes and viruses, most of the genetic material encodes proteins, and coding regions (open reading frames, or ORFs) may even lie tail to tail or overlap. In eukaryotes, more of the DNA does not encode proteins than does. These noncoding regions are important for gene regulation, packaging into the chromosomes, and attachment to the nuclear matrix for cell division and gene accessibility for transcription. In some clinical applications the desired target is the gene, while in others it is the intergenic regions.

Besides the protein-encoding region, several other sequences in DNA are required. For replication, specific sequences called the origin of replication (ORI) occur in bacteria and their plasmids. Bacterial DNA is a closed circle, so replication starts at the ORI and continues around to the other side of the circle. In eukaryotes, the DNA in a chromosome is linear, and replication starts at multiple sites and continues in both directions until the pieces meet.

For a gene to be transcribed, there are sequences upstream (more 5’ than the AUG that encodes the start of the protein) that initiate transcription. Transcription of genes is regulated by the binding of transcription factor proteins to sequences known as promoter regions. The transcription (txn) factor proteins are able to move from the cytoplasm to the nucleus in response to some signal or hormone, and then bind specifically to their promoter(s). RNA polymerase binds the DNA and the txn factor, and RNA is made from an initiation site downstream of the promoter. Transcription terminates after the stop codon of the gene because of the binding of special termination proteins or the secondary structure of the RNA itself. The primary RNA transcript encodes the information for the introns to be spliced out, leaving the exons, and for the addition of the polyA tail, making the mature mRNA.
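
The idea of an open reading frame (ORF) mentioned above can be illustrated with a deliberately simple Python sketch. It is not part of the original tutorial and makes several assumptions: it scans only the three forward reading frames of one strand, treats ATG as the only start codon, uses the three standard stop codons TAA, TAG and TGA, and ignores introns. The function name `find_orfs` and the example sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return (start, end, sequence) for each ATG...stop stretch.

    Only the three forward frames are scanned. `start` and `end` are
    0-based positions, with `end` just past the stop codon.
    `min_codons` counts the codons before the stop codon.
    """
    seq = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens the reading frame
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, seq[start:i + 3]))
                start = None  # look for the next ORF in this frame
    return orfs


example = "CCATGTACTATTAAGG"  # hypothetical sequence with one short ORF
for start, end, orf in find_orfs(example):
    print(start, end, orf)
```

Real gene-finding tools also examine the complementary strand and, in eukaryotes, must account for introns, so this sketch only conveys the basic reading-frame idea.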

For the mRNA to be translated, there is a ribosomal binding site upstream of the AUG start codon, and translation is terminated by the stop codon in the sequence. The rRNA of the ribosome binds the mRNA in preparation for translation. Three bases of the mRNA are recognized by 3 complementary bases in the tRNA, and the ribosome catalyzes the addition of the tRNA’s amino acid to the one just before it.

A few final points concern sequence information at the protein level. The protein product may not be mature after being translated. There are sequences at the N-terminus that are known as signal sequences or signal peptides. These often are hydrophobic (not attracted to water), and will insert themselves into the endoplasmic reticulum (ER) membrane. In this way, proteins that are destined to be secreted to the outside of the cell or to remain membrane bound are distinguished from the cytosolic proteins (proteins destined to remain in the cytosol) and pass from the cytosol to the lumen of the ER. The signal peptide is cleaved from the mature protein once it is in the ER. The protein will then traffic to the outer membrane, acquiring carbohydrate groups if it is a glycosylated protein, S-S bonds if it has disulphide bridges, etc.

The hint given above about the Central Dogma suggests there are instances in which the central dogma is not followed. For example:

1) RNA viruses replicate by copying their RNA genome directly into RNA.

2) Retroviruses copy their RNA into DNA (reverse transcription) and then replicate by copying the DNA.

3) Prions are self-replicating proteins. Prions do not encode more prion proteins, but are able to convert the shape of the normal protein into the prion form.

Reverse translation remains a laboratory tool for some probe design applications but, because of the redundancy of the genetic code, is not the most accurate way to design probes for the detection of DNA or RNA. For more information about the Central Dogma, see Access Excellence. For the one-letter amino acid codes, see Amino Acid codes.

One more convention: in the literature, it is common to write the name of a gene as a 3- or 4-letter code in italics, followed by a number or letter designating its order within a gene family or by discovery. Dominant genes are capitalized and recessive ones are not. Thus the hemoglobin A gene is HbbA. The protein is written hbbA (recessive) or HbbA (dominant), without italics.
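
The remark above about reverse translation and the redundancy of the genetic code can be made concrete by counting how many different DNA sequences could encode a given peptide. The following Python sketch is an illustration that is not part of the original tutorial; it rebuilds the standard genetic code from a compact 64-letter string (bases ordered T, C, A, G), and the name `count_encodings` is hypothetical. The peptide used is the one-letter sequence quoted earlier in this part.

```python
from collections import Counter
from itertools import product
from math import prod

BASES = "TCAG"
# Standard genetic code, one letter per codon; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of codons that encode each amino acid (1 for Met, up to 6 for Ser).
CODONS_PER_AMINO_ACID = Counter(aa for aa in CODON_TABLE.values() if aa != "*")


def count_encodings(peptide):
    """Count the distinct coding sequences that translate to `peptide`."""
    return prod(CODONS_PER_AMINO_ACID[aa] for aa in peptide.upper())


peptide = "MSPVYYRYVA"
for aa in peptide:
    print(aa, CODONS_PER_AMINO_ACID[aa])
print("possible coding sequences:", count_encodings(peptide))
```

Even a ten-residue peptide can be encoded by tens of thousands of different DNA sequences, which is why a probe designed from a protein sequence alone may not match the actual gene.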

University of Calgary Biotechnology Training Centre

Selection from the College Board Advanced Placement Biology Lab Manual for Students (2001)

“Restriction endonucleases or “enzymes” are essential tools in recombinant DNA technology. Restriction endonucleases or “enzymes” were first discovered in the 1960s by researchers studying bacteria. Restriction enzymes are found naturally in bacterial cells and function in protecting the bacterial cell from foreign DNA of other organisms, such as other species of bacteria or phages (viruses that infect bacterial cells). Restriction enzymes protect the bacterial cell by cutting up foreign and potentially harmful DNA. There are hundreds of different restriction enzymes that have been identified and isolated and are unique in their abilities to cut DNA molecules at a limited number of specific locations, a valuable property that has allowed gene cloning and genetic engineering to be made possible. There is a specific nomenclature used for naming restriction enzymes in which the letters refer to the organisms from which the enzyme was isolated. The first letter of the name stands for the genus name of the organisms. The next two letters represent the second word, or the species name. The fourth letter (if there is one) represents the strain of the organism. Roman numerals indicate whether the particular enzyme was the first isolated, the second, or so on.” (The College Board AP Biology Lab Manual (2001) Lab 6 Molecular Biology pp.68-69)

Examples:

HaeIII

H = genus Haemophilus
ae = species aegyptius
III = third endonuclease isolated

EcoRI

E = genus Escherichia
co = species coli
R = strain RY13
I = first endonuclease isolated
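
The naming convention illustrated by these two examples can be sketched as a small Python parser. This is a hedged illustration and not part of the quoted lab manual: the regular expression is a simplification, the function names are hypothetical, and a strain designation that itself ends in I, V or X would be read as part of the Roman numeral.

```python
import re

# One capital letter (genus), two lowercase letters (species),
# an optional strain designation, then a Roman numeral.
NAME_PATTERN = re.compile(r"^([A-Z])([a-z]{2})([A-Za-z0-9]*?)([IVX]+)$")
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10}


def roman_to_int(numeral):
    """Convert a short Roman numeral such as I, III or IV to an integer."""
    total = 0
    for i, char in enumerate(numeral):
        value = ROMAN_VALUES[char]
        next_is_larger = i + 1 < len(numeral) and ROMAN_VALUES[numeral[i + 1]] > value
        total += -value if next_is_larger else value
    return total


def parse_enzyme_name(name):
    """Split a restriction enzyme name into the parts described in the text."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError("unrecognized enzyme name: " + name)
    genus_initial, species, strain, numeral = match.groups()
    return {
        "genus_initial": genus_initial,
        "species": species,
        "strain": strain,  # empty string when no strain letter is present
        "order": roman_to_int(numeral),
    }


for enzyme in ("HaeIII", "EcoRI"):
    print(enzyme, parse_enzyme_name(enzyme))
```

The parser recovers only the abbreviations; mapping "E" and "co" back to Escherichia coli still requires knowing the organism.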

Selection from Campbell & Reece, AP Biology, Seventh Edition

“Hundreds of different restriction enzymes have been identified and isolated. Each restriction enzyme is very specific, recognizing a particular short DNA sequence, or restriction site, and cutting both DNA strands at specific points within this restriction site. The DNA of a bacterial cell is protected from the cell’s own restriction enzymes by the addition of methyl groups (-CH3) to adenines or cytosines within the sequences recognized by the enzymes. Most restriction enzymes recognize sequences containing four to eight nucleotides. Because any sequence this short usually occurs (by chance) many times in a long DNA molecule, a restriction enzyme will make many cuts in a DNA molecule, yielding a set of restriction fragments. All copies of a particular DNA molecule always yield the same set of restriction fragments when exposed to the same restriction enzyme. The most useful restriction enzymes cleave the sugar-phosphate backbones in both DNA strands in a staggered way. The resulting double-stranded restriction fragments have at least one single-stranded end, called a sticky end. These short extensions can form hydrogen-bonded base pairs with complementary sticky ends on any other DNA molecules cut with the same enzyme. The associations formed in this way are only temporary, but the associations between fragments can be made permanent by the enzyme DNA ligase. This enzyme catalyzes the formation of covalent bonds that close up the sugar-phosphate backbones. The ligase-catalyzed joining of DNA from two different sources produces a stable recombinant DNA molecule.” (Campbell & Reece, AP Edition Biology Seventh Edition, p.386)
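
The statement that a given DNA molecule always yields the same set of restriction fragments can be illustrated with a short Python sketch. This example is not from the quoted textbook and rests on stated assumptions: the DNA is treated as a linear molecule, only the top strand is modeled, and the EcoRI recognition site GAATTC with a cut after the first G is supplied as a well-known example rather than taken from the passage above. The function name `digest` and the example sequence are hypothetical.

```python
def digest(dna, site, cut_offset):
    """Cut a linear DNA sequence at every occurrence of `site`.

    `cut_offset` is the number of bases into the recognition site at which
    the top strand is cut (1 for EcoRI, which cuts G^AATTC).
    Returns the top-strand fragments in order, written 5'->3'.
    """
    seq = dna.upper()
    fragments = []
    fragment_start = 0
    position = seq.find(site)
    while position != -1:
        cut = position + cut_offset
        fragments.append(seq[fragment_start:cut])
        fragment_start = cut
        position = seq.find(site, position + 1)
    fragments.append(seq[fragment_start:])
    return fragments


ECORI_SITE = "GAATTC"  # assumption: standard EcoRI recognition sequence
ECORI_CUT = 1          # assumption: cut between G and AATTC

dna = "AAGAATTCTTTTGAATTCGG"  # hypothetical sequence with two EcoRI sites
fragments = digest(dna, ECORI_SITE, ECORI_CUT)
print(fragments)
print([len(fragment) for fragment in fragments])
```

Each internal fragment begins with AATTC on the top strand; on a real double-stranded molecule the staggered cut on the opposite strand leaves the four-base AATT overhang that forms the sticky end described in the passage.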

Animations

1. Howard Hughes Medical Institute: Biointeractive Animations

a. DNA Replication (advanced detail) (www.hhmi.org, Multimedia Document)

b. DNA Transcription (advanced detail) (www.hhmi.org, Multimedia Document)

c. Translation (advanced detail) (www.hhmi.org, Multimedia Document)

d. Triplet Code (www.hhmi.org, Multimedia Document)

e. Coding Sequences (www.hhmi.org, Multimedia Document)

2. McGraw Hill Higher Education Biology Animations

a. DNA Replication Fork (highered.mcgraw-hill.com, Multimedia Page)

b. How Nucleotides are Added in DNA Replication (highered.mcgraw-hill.com, Multimedia Page)

c. Processing of Gene Information – Prokaryotes versus Eukaryotes (highered.mcgraw-hill.com, Multimedia Page)

d. How Spliceosomes Process RNA (highered.mcgraw-hill.com, Multimedia Page)

e. Protein Synthesis (highered.mcgraw-hill.com, Multimedia Page)

3. University of Utah Genetic Science Learning Center: DNA to Proteins

a. Transcribe and Translate a Gene (learn.genetics.utah.edu, Multimedia Page)

Activities

1. Cold Spring Harbor Laboratory DNA Interactive – Genome Complete Gene Feature & Gene Finding (www.dnai.org, Multimedia Page)

2. The Arizona Biology Project – Nucleic Acids Tutorials & Problems

Click on the following website link to complete the Arizona Biology Project Nucleic Acids Tutorials and Problems. When you access the webpage, begin by clicking on “Begin Problem Set” to view Problem 1. Each of the 15 problems is presented with a set of possible solutions. For a more thorough review of the content, it is recommended that you click on the “Tutorial” option presented below each multiple-choice question before answering the problem. If you answer the problem correctly, you will be notified that you are “Correct”; if you answer incorrectly, you will be directed to the tutorial to review the content before selecting another answer.
Arizona Biology Project – Nucleic Acids Tutorials (www.biology.arizona.edu, Multimedia Document)