
What Is DNA? (Part 1 of 2)


by Richard Saferstein, Ph.D. Reprinted with permission from Dr. Saferstein.


Inside each of the 60 trillion cells in the human body are strands of genetic material called chromosomes. Arranged along the chromosomes, like beads on a thread, are nearly 100,000 genes. The gene is the fundamental unit of heredity. It instructs the body cells to make proteins that determine everything from hair color to our susceptibility to diseases. Each gene is actually composed of DNA specifically designed to carry out a single body function.

DNA is a polymer. A polymer is a very large molecule made by linking together a series of repeating units. In this case the units are known as nucleotides. A nucleotide is composed of a sugar molecule, a phosphorus-containing group, and a nitrogen-containing molecule called a base.

Figure 1 shows how nucleotides can be strung together to form a DNA strand. In this figure, S designates the sugar component, which is joined together with a phosphate group to form the backbone of the DNA strand. Projecting from the backbone are the bases. The key to understanding how DNA works is to appreciate the fact that there are only four types of bases associated with DNA: adenine, cytosine, guanine, and thymine. To simplify our discussion of DNA, we will designate each of these bases by the first letter of its name. Hence, A will stand for adenine, C will stand for cytosine, G will stand for guanine, and T will represent thymine. [material omitted]. Keep in mind that in theory there is no limit to the length of the DNA strand; in fact, a DNA strand can be composed of a long chain having millions of bases.

The DNA molecule is actually composed of two DNA strands coiled into a double helix. This can be thought of as resembling two wires twisted around one another. As James Watson and Francis Crick, the researchers who worked out this structure, manipulated scale models of DNA strands, they realized that the only way the bases on each strand could be properly aligned with one another in a double helix configuration was to place base A opposite T and G opposite C.

This pairing of A with T and G with C has become known as base pairing. Although A-T and G-C pairs are always required, there are no restrictions on how the bases are to be sequenced on a DNA strand. Thus, one can observe the sequences T-A-T-T or G-T-A-A or G-T-C-A. When these sequences are joined together with their opposite number in a double-helix configuration, they pair as follows:

T-A-T-T    G-T-A-A    G-T-C-A
| | | |    | | | |    | | | |
A-T-A-A    C-A-T-T    C-A-G-T
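The base-pairing rule lends itself to a short sketch in Python. The function name `pair_strand` is chosen here purely for illustration; the only thing taken from the text is the rule itself (A opposite T, G opposite C), applied to the three example sequences.

```python
# Base-pairing rule from the text: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def pair_strand(strand: str) -> str:
    """Return the bases that sit opposite each base of `strand`."""
    return "".join(PAIRS[base] for base in strand)


# The three example sequences mentioned in the text.
for seq in ("TATT", "GTAA", "GTCA"):
    print(seq, "pairs with", pair_strand(seq))
```

Because the rule is symmetric, applying `pair_strand` twice returns the original sequence.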


Any base can follow another on a DNA strand, which means that the possible number of different sequence combinations is staggering! Consider that the average human chromosome has DNA containing 100 million base pairs. All the human chromosomes taken together contain about 3 billion base pairs. From these numbers we can begin to appreciate the diversity of DNA and hence the diversity of living organisms. DNA is like a book of instructions. The alphabet used to create the book is simple enough: A, T, G, and C. The order in which these letters are arranged defines the role and function of a DNA molecule.
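To get a feel for how quickly the number of possible sequences grows, consider this rough sketch. It assumes only what the paragraph states: each position on a strand can hold any one of four bases, so a strand of n bases has 4 to the power n possible sequences. The helper names are illustrative.

```python
import math


def sequence_count(length: int) -> int:
    """Number of distinct base sequences of the given length (4 choices per position)."""
    return 4 ** length


def digits_in_sequence_count(length: int) -> int:
    """Number of decimal digits in 4**length, computed without building the huge number."""
    return math.floor(length * math.log10(4)) + 1


for n in (1, 4, 10, 20):
    print(f"{n} bases: {sequence_count(n):,} possible sequences")

# For a strand of 100 million bases, the count itself is a number
# with tens of millions of digits.
print(digits_in_sequence_count(100_000_000), "digits")
```

Even a stretch of only ten bases already allows more than a million different sequences.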

Polymerase Chain Reaction (PCR)

Once the double-helix structure of DNA was discovered, it became apparent how DNA duplicated itself prior to cell division. The concept of base pairing in DNA suggests the analogy of positive and negative photographic film. Each strand of DNA in the double helix has the same information; one can make a positive print from a negative or a negative from a positive. DNA replication begins with the unwinding of the DNA strands in the double helix. Each strand is then exposed to a collection of free nucleotides. Letter by letter, the double helix is recreated as the nucleotides are assembled in the proper order, as dictated by the principle of base pairing (i.e., A with T and G with C). The result is the emergence of two identical copies of DNA where before there was only one. A cell can now pass on its genetic identity when it divides.
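The replication process described above can be mimicked in a few lines of Python. This is a simplified sketch: it treats a double helix as a pair of strings, ignores the chemical direction of the strands (which the text does not discuss), and uses made-up names such as `replicate`.

```python
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Build the partner strand letter by letter using base pairing."""
    return "".join(PAIRS[base] for base in strand)


def replicate(double_helix):
    """Unwind a (strand, partner) pair and rebuild a new partner for each strand.

    Returns two double helices, each made of one old strand and one new strand.
    """
    strand, partner = double_helix
    first_copy = (strand, complement(strand))     # old strand + newly built partner
    second_copy = (complement(partner), partner)  # newly built strand + old partner
    return first_copy, second_copy


original = ("GTCA", "CAGT")
copy_one, copy_two = replicate(original)
print(copy_one, copy_two)
```

Both copies come out identical to the original pair, which is the point of the photographic-negative analogy.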

Until recently the phenomenon of DNA replication appeared to be only of academic interest to forensic scientists interested in DNA for identification purposes. However, this changed when researchers were able to perfect the technology of copying a DNA strand. This new laboratory technique is known as PCR (polymerase chain reaction). Small quantities of DNA or broken pieces of DNA found in crime-scene evidence can be copied with the aid of a DNA polymerase. The PCR technique is capable of yielding useful information with as little as one-billionth of a gram of DNA.

PCR begins with heating the DNA to separate the strands. Primers (short DNA segments of known bases) are then added to combine with the strands. An enzyme, DNA polymerase, capable of synthesizing a specific region of DNA, is added to the separated strands along with a mixture of free nucleotides (G, A, T, C). The enzyme directs the rebuilding of a double-stranded DNA molecule, extending the primers by adding the appropriate bases, one at a time, resulting in the production of two complete pairs of double-stranded DNA segments. This completes the first cycle of the PCR technique, and the outcome is a doubling of the number of DNA strands, that is, from one to two. This cycle is then repeated 25 to 30 times, and over one million copies of the original DNA molecule are produced.

As an example, let's consider a segment of DNA that we want to duplicate by PCR:


In order to perform PCR on this DNA segment, short sequences of DNA on each side of the region of interest must be identified. In the example shown above, the short sequences are designated by boldface letters in the DNA segment. These short DNA segments must be available in a pure form known as a primer if the PCR technique is going to work.
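The idea of bracketing a region of interest with two short known sequences can be illustrated with a small sketch. The DNA segment and flanking sequences below are invented for illustration and are not the example from the original figure. The sketch also reads both flanking sequences off the same strand, whereas real primer design takes strand direction into account.

```python
def region_between(segment: str, left_flank: str, right_flank: str):
    """Return the stretch of `segment` from the left flank through the right flank.

    Returns None if either flanking sequence cannot be found in order.
    """
    start = segment.find(left_flank)
    if start == -1:
        return None
    end = segment.find(right_flank, start + len(left_flank))
    if end == -1:
        return None
    return segment[start:end + len(right_flank)]


# Hypothetical segment: the flanks GACCG and CTAGG bracket the region to be copied.
segment = "TTGACCGTAATGAATGAATGCCTAGGAT"
print(region_between(segment, "GACCG", "CTAGG"))
```

If either flanking sequence is absent from the sample, nothing is bracketed and the function returns `None`, which mirrors the requirement that the flanking sequences be known in advance.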

The first step in the PCR process is to heat the DNA strands to about 94°C. At this temperature, the double-stranded DNA molecules separate completely:


The second step is to add the primers to the separated strands and allow the primers to combine or hybridize with the strands by lowering the test-tube temperature.


The third step is to add the DNA polymerase and a mixture of free nucleotides (G, A, T, C) to the separated strands. When the test tube is heated up to 72°C, the polymerase enzyme directs the rebuilding of a double-stranded DNA molecule, extending the primers by adding the appropriate bases, one at a time, resulting in the production of two complete pairs of double-stranded DNA segments.



This completes the first cycle of the PCR technique, and the outcome is a doubling of the number of DNA strands, that is, from one to two. The cycle of heating, cooling, and strand rebuilding is then repeated, resulting again in a doubling of the DNA strands. Upon completion of the second cycle, four double-stranded DNA molecules will have been created from the original double-stranded DNA sample. Typically, 25 to 30 cycles are carried out to yield over one million copies of the original DNA molecule. Each cycle takes less than two minutes to perform.
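The arithmetic of repeated doubling is easy to check. The sketch below assumes ideal conditions in which every cycle exactly doubles the number of molecules; real reactions are somewhat less efficient, so these figures are upper bounds. The function name is illustrative.

```python
def copies_after(cycles: int, starting_molecules: int = 1) -> int:
    """Number of double-stranded molecules after the given number of ideal PCR cycles."""
    return starting_molecules * 2 ** cycles


for cycles in (1, 2, 20, 25, 30):
    print(f"{cycles:>2} cycles: {copies_after(cycles):,} copies")
```

Under ideal doubling the one-million mark is already passed at 20 cycles, so 25 to 30 cycles comfortably exceed it. At less than two minutes per cycle, 30 cycles take under an hour.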


The key to understanding DNA typing lies in the knowledge that within the world's population there are numerous possibilities for the number of times a particular sequence of base letters can repeat itself on a DNA strand. The possibilities become even greater when one deals with two chromosomes, each containing different lengths of repeating sequences. The most widely used technique for DNA typing combines PCR (Polymerase Chain Reaction) with STR (Short Tandem Repeat) analysis.

STRs are locations (loci) on the chromosome which contain short sequence elements that repeat themselves within the DNA molecule. They serve as helpful markers for identification because they are found in great abundance throughout the human genome. What is important to appreciate is that the repeating sequence is relatively short in length, three to seven bases, and that the entire strand of an STR is also very short, that is, less than 500 bases in length. This means that STRs are much less susceptible to degradation and may often be recovered from bodies or stains that have been subject to extreme decomposition. In order to understand the utility of STRs in forensic science, let's look at one commonly used STR known as HUMTH01. The DNA segment contains the repeating sequence AATG. There have been seven HUMTH01 variants identified in the human genome. These variants contain five through eleven repeats of AATG.
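What distinguishes one HUMTH01 variant from another is simply how many times AATG repeats in a row. As a rough illustration, the sketch below counts the longest run of consecutive AATG units in a sequence given as text. The flanking bases in the example are invented, and in practice the repeat count is inferred from fragment length on a gel rather than read off letter by letter.

```python
def longest_repeat_run(sequence: str, unit: str = "AATG") -> int:
    """Return the largest number of back-to-back copies of `unit` found in `sequence`."""
    if not unit:
        raise ValueError("unit must be a non-empty string")
    best = 0
    for start in range(len(sequence)):
        run = 0
        position = start
        # Count how many copies of the unit follow one another from this start.
        while sequence.startswith(unit, position):
            run += 1
            position += len(unit)
        best = max(best, run)
    return best


# Hypothetical flanking bases (GGTC, CCTA) around six AATG repeats.
sample = "GGTC" + "AATG" * 6 + "CCTA"
print(longest_repeat_run(sample))
```

A variant with eight repeats would be counted the same way and would be 8 bases longer than the six-repeat variant, which is the length difference that electrophoresis picks up.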

During a forensic examination, HUMTH01 is extracted out of biological materials and amplified by PCR. The ability to copy an STR means that extremely small amounts of the molecule can be detected and analyzed. Once the STRs have been copied or amplified, they are separated on an electrophoretic gel. By examining the distance the STR has migrated on the electrophoretic plate, one can determine the number of AATG repeats that exist in the STR. Every person has two STR types for HUMTH01, each inherited from one parent. Thus, for example, one may find in a semen stain HUMTH01 with six repeats and eight repeats. This combination of HUMTH01 is only found in approximately 3.5% of the population.

What makes STRs so attractive to forensic scientists is that there are hundreds of different types of STRs found in human genes. The more STRs one can characterize, the smaller will be the percentage of the population from which these STRs can emanate. This gives rise to the concept of multiplexing. Using the technology of PCR, one can simultaneously extract and amplify a combination of different STRs. For example, one system on the commercial market is the STR Blue kit. This kit provides the necessary materials for the coamplification and detection of three STRs - D3S1358, vWA, and FGA (triplexing). The design of the system ensures that the size of the STRs does not overlap, thereby allowing each marker to be viewed clearly on an electrophoretic gel.

Currently, in the United States, the forensic science community has standardized on thirteen STRs for entry into a national database known as the Combined DNA Index System (CODIS). The thirteen STRs are listed in Table 1 along with their probabilities of identity. The probability of identity is the probability that two individuals selected at random will have an identical STR type. The smaller the value of this probability, the more discriminating the STR. A higher degree of discrimination and even individualization can be attained by analyzing a combination of STRs (multiplexing). Because STRs occur independently of each other, the probability of biological evidence having a particular combination of STR types is determined by the product of their frequencies of occurrence in a population. Hence, the greater the number of STRs characterized, the smaller the frequency of occurrence of the analyzed sample in the general population.
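The product rule described above amounts to a single multiplication. The sketch below uses made-up frequencies purely to show the arithmetic; they are not the values from Table 1, and the function name is illustrative.

```python
from math import prod


def combined_frequency(frequencies) -> float:
    """Multiply the individual STR type frequencies, assuming they are independent."""
    return prod(frequencies)


# HYPOTHETICAL frequencies for three STR types (not taken from Table 1).
example = [0.1, 0.05, 0.02]
frequency = combined_frequency(example)
print(f"combined frequency is about 1 in {round(1 / frequency):,}")
```

Each additional STR multiplies in another number smaller than one, so the combined frequency can only shrink as more STRs are characterized.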


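[Table 1. The thirteen CODIS STRs and their probabilities of identity, U.S. Caucasian population. Table not reproduced here.]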

The combination of the first three STRs shown in Table 1 typically produces a frequency of occurrence of about 1 in 5000. A combination of the first six STRs typically yields a frequency of occurrence in the range of 1 in two million for the Caucasian population, and if the top nine STRs are determined in combination, this frequency declines to about 1 in one billion. The combination of all 13 STRs shown in Table 1 typically produces frequencies of occurrence on the order of one in trillions. Importantly, there are a number of commercially available kits which readily allow forensic scientists to profile STRs in the kind of combinations cited above.

Manufacturers of commercial STR kits typically used by crime laboratories have made provisions to provide analysts with one additional piece of useful information along with STR types: the gender or sex of the DNA contributor. The focus of attention here is the amelogenin gene located on both the X and Y chromosomes. This gene, which is actually the gene for tooth pulp, has an interesting characteristic in that it is shorter by six bases in the X chromosome as compared to the Y chromosome. Hence, when the amelogenin gene is amplified by PCR and separated by electrophoresis, a male with an X and a Y chromosome will show two bands; a female having X and X chromosomes will have just one band. Typically, these results are obtained in conjunction with STR types.
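The band-counting logic for amelogenin can be written down as a small decision rule. In this sketch the only fact taken from the text is the six-base difference between the X and Y copies; the fragment lengths used in the example (106 and 112 bases) are placeholder values, and the function name is illustrative.

```python
def infer_sex(band_sizes) -> str:
    """Interpret amelogenin bands (fragment lengths in bases) seen after electrophoresis.

    One band -> female (X and X copies are the same length).
    Two bands six bases apart -> male (X copy is six bases shorter than Y copy).
    Anything else -> inconclusive.
    """
    distinct = sorted(set(band_sizes))
    if len(distinct) == 1:
        return "female"
    if len(distinct) == 2 and distinct[1] - distinct[0] == 6:
        return "male"
    return "inconclusive"


# Placeholder fragment lengths, chosen only to differ by six bases.
print(infer_sex([106, 112]))
print(infer_sex([106]))
```

Any pattern other than one band or two bands six bases apart is reported as inconclusive rather than guessed at.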
