by Richard Saferstein, Ph.D.
[This article is in two parts.]
Inside each of the 60 trillion
cells in the human body are strands of genetic material called chromosomes.
Arranged along the chromosomes, like beads on a thread, are the genes (roughly 20,000 to 25,000 by current estimates; older estimates ran to nearly 100,000).
The gene is the fundamental unit of heredity. It instructs the body cells to
make proteins that determine everything from hair color to our susceptibility
to diseases. Each gene is actually composed of DNA specifically designed to
carry out a single body function.
DNA is a polymer.
A polymer is a very large molecule made by linking together a series of repeating
units. In this case the units are known as nucleotides. A nucleotide is composed
of a sugar molecule, a phosphorus-containing group, and a nitrogen-containing
molecule called a base.
Figure 1 shows
how nucleotides can be strung together to form a DNA strand. In this figure,
S designates the sugar component, which is joined together with a phosphate
group to form the backbone of the DNA strand. Projecting from the backbone are
the bases. The key to understanding how DNA works is to appreciate the fact
that there are only four types of bases associated with DNA: adenine, cytosine,
guanine, and thymine. To simplify our discussion of DNA, we will designate each
of these bases by the first letter of its name. Hence, A will stand for adenine,
C will stand for cytosine, G will stand for guanine, and T will represent thymine.
[material omitted]. Keep in mind that in theory there is no limit to the length
of the DNA strand; in fact, a DNA strand can be composed of a long chain having
millions of bases.
The DNA molecule is
actually composed of two DNA strands coiled into a double helix. This can be
thought of as resembling two wires twisted around one another. The structure
was deduced in 1953 by James Watson and Francis Crick. As these researchers
manipulated scale models of DNA strands, they realized that the only way the
bases on each strand could be properly aligned with one another in a double
helix configuration was to place base A opposite T and G opposite C, a concept
that has become known as base pairing. Although
A-T and G-C pairs are always required, there are no restrictions on how the
bases are to be sequenced on a DNA strand. Thus, one can observe the sequences
T-A-T-T or G-T-A-A or G-T-C-A. When these sequences are joined
together with their opposite
number in a double-helix configuration, they pair as follows:
T-A-T-T G-T-A-A G-T-C-A
A-T-A-A C-A-T-T C-A-G-T
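The pairing rule amounts to a simple lookup. The following is a minimal Python sketch of it; the helper name `complementary_strand` is illustrative and not part of the original article, and the sketch pairs bases position by position without modelling strand direction.

```python
# Base-pairing rule: A pairs with T, and G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementary_strand(strand: str) -> str:
    """Return the bases that sit opposite each base of `strand`.

    `strand` is written with hyphens between bases, as in the text
    (for example "T-A-T-T"). Illustrative helper, not from the article.
    """
    return "-".join(PAIRS[base] for base in strand.split("-"))


for strand in ("T-A-T-T", "G-T-A-A", "G-T-C-A"):
    print(strand, "pairs with", complementary_strand(strand))
```

Because the lookup is symmetric, applying the function twice returns the starting strand.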
Any base can follow another on a DNA strand, which means that the possible number of
different sequence combinations is staggering! Consider that the average human
chromosome has DNA containing 100 million base pairs. All the human chromosomes
taken together contain about 3 billion base pairs. From these numbers we can
begin to appreciate the diversity of DNA and hence the diversity of living organisms.
DNA is like a book of instructions. The alphabet used to create the book is
simple enough: A, T, G, and C. The order in which these letters are arranged
defines the role and function of a DNA molecule.
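To get a feel for how quickly the number of possible sequences grows, the short Python sketch below counts them. It assumes only what the text states: four possible bases at each position and no restriction on their order. The function names are illustrative.

```python
import math


def sequence_count(length: int) -> int:
    """Number of distinct base sequences of a given length (4 choices per position)."""
    return 4 ** length


def digits_in_sequence_count(length: int) -> int:
    """Number of decimal digits in 4**length.

    Uses logarithms so that very long strands do not require
    building the enormous integer itself.
    """
    return math.floor(length * math.log10(4)) + 1


print(sequence_count(4))    # sequences of four bases, such as T-A-T-T
print(sequence_count(10))   # sequences of ten bases
# Size of the count for a strand of 100 million bases, in decimal digits:
print(digits_in_sequence_count(100_000_000))
```

Even a ten-base stretch allows more than a million different sequences; for a strand the length of an average chromosome the count itself runs to tens of millions of digits.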
Polymerase Chain Reaction (PCR)
Once the double-helix structure of DNA was discovered, it became apparent how DNA
duplicated itself prior to cell division. The concept of base pairing in DNA
suggests the analogy of positive and negative photographic film. Each strand
of DNA in the double helix has the same information; one can make a positive
print from a negative or a negative from a positive. DNA replication begins
with the unwinding of the DNA strands in the double helix. Each strand is then
exposed to a collection of free nucleotides. Letter by letter, the double helix
is recreated as the nucleotides are assembled in the proper order, as dictated
by the principle of base pairing (i.e., A with T and G with C). The result is
the emergence of two identical copies of DNA where before there was only one.
A cell can now pass on its genetic identity when it divides.
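The replication steps just described (unwind, then rebuild each strand's partner letter by letter) can be sketched in a few lines of Python. This is a simplified model rather than a description of the cellular machinery: a double helix is represented as a pair of strings, strand direction is ignored, and the function names are illustrative.

```python
from typing import List, Tuple

# Base-pairing rule: A with T and G with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def build_partner(template: str) -> str:
    """Assemble the partner strand letter by letter from free nucleotides."""
    return "".join(PAIRS[base] for base in template)


def replicate(double_helix: Tuple[str, str]) -> List[Tuple[str, str]]:
    """Unwind a double helix and rebuild a partner for each exposed strand.

    The helix is modelled as (strand, partner), written position by position.
    """
    strand, partner = double_helix
    copy_one = (strand, build_partner(strand))    # old strand + new partner
    copy_two = (build_partner(partner), partner)  # new strand + old partner
    return [copy_one, copy_two]


original = ("GTCA", "CAGT")
copies = replicate(original)
print(copies)
print(all(copy == original for copy in copies))
```

Each copy keeps one strand of the original helix and gains one newly assembled strand, which is why both copies carry the same information.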
Until recently the phenomenon of DNA replication appeared to be only of academic interest
to forensic scientists interested in DNA for identification purposes. However,
this changed when researchers were able to perfect the technology of copying
a DNA strand. This new laboratory technique is known as PCR (polymerase chain
reaction). Small quantities of DNA or broken pieces of DNA found in crime-scene
evidence can be copied with the aid of a DNA polymerase. The PCR technique is
capable of yielding useful information with as little as one-billionth of a
gram of DNA.
PCR begins with heating the DNA to separate the strands. Primers (short DNA segments
of known bases) are then added to combine with the strands. An enzyme, DNA polymerase,
capable of synthesizing a specific region of DNA, is added as well as a mixture
of free nucleotides (G,A,T,C) to the separated strands. The enzyme directs the
rebuilding of a double-stranded DNA molecule, extending the primers by adding
the appropriate bases, one at a time, resulting in the production of two complete
pairs of double-stranded DNA segments. This completes the first cycle of the
PCR technique, and the outcome is a doubling of the number of double-stranded DNA
molecules, that is, from one to two. This cycle is then repeated 25 to 30 times, and over one
million copies of the original DNA molecule are produced.
As an example, let's consider a segment of DNA that we want to duplicate by PCR:
In order to perform PCR on this DNA segment, short sequences of DNA on each side
of the region of interest must be identified. In the example shown above, the
short sequences are designated by boldface letters in the DNA segment. These
short DNA segments must be available in a pure form known as a primer if the
PCR technique is going to work.
The first step in the PCR process is to heat the DNA strands to about 94°C. At this
temperature, the double-stranded DNA molecules separate completely:
The second step is to add the primers to the separated strands and allow the primers
to combine or hybridize with the strands by lowering the test-tube temperature.
The third step is to add the DNA polymerase and a mixture of free nucleotides (G,A,T,C)
to the separated strands. When the test tube is heated up to 72°C, the polymerase
enzyme directs the rebuilding of a double-stranded DNA molecule, extending
the primers by adding the appropriate bases, one at a time, resulting in the
production of two complete pairs of double-stranded DNA segments.
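The role of the primers can be illustrated with a small Python sketch that finds the stretch of DNA a primer pair would mark out for copying. Everything in it is hypothetical: the template sequence, the two six-base primers (real primers are longer), and the function names. It also assumes the usual convention that one primer matches the strand as written and the other matches the opposite strand, read in the reverse direction.

```python
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Opposite strand of `seq`, read in the reverse direction."""
    return "".join(PAIRS[base] for base in reversed(seq))


def amplified_region(template: str, forward_primer: str, reverse_primer: str) -> str:
    """Return the part of `template` bracketed by the two primers.

    `forward_primer` matches the template as written; `reverse_primer`
    matches the opposite strand, so its reverse complement is what
    appears in the template. Raises ValueError if a primer has no site.
    """
    start = template.find(forward_primer)
    if start == -1:
        raise ValueError("forward primer has no binding site")
    site = reverse_complement(reverse_primer)
    site_start = template.find(site, start + len(forward_primer))
    if site_start == -1:
        raise ValueError("reverse primer has no binding site")
    return template[start:site_start + len(site)]


# Hypothetical template and primers, chosen only for illustration.
template = "TTGACCAGTAATGAATGAATGCCTGAAGG"
print(amplified_region(template, "GACCAG", "CTTCAG"))
```

Bases outside the two primer sites (the leading "TT" and trailing "G" here) fall outside the region the primers define.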
This completes the first cycle of the PCR technique, and the outcome is a doubling
of the number of double-stranded DNA molecules, that is, from one to two. The cycle of heating,
cooling, and strand rebuilding is then repeated, resulting again in a doubling
of the DNA strands. Upon completion of the second cycle, four double-stranded
DNA molecules will have been created from the original double-stranded DNA sample.
Typically, 25 to 30 cycles are carried out to yield over one million copies
of the original DNA molecule. Each cycle takes less than two minutes to perform.
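The doubling arithmetic is easy to check. The Python sketch below assumes ideal conditions, with every molecule copied in every cycle, which real reactions only approximate; the function names are illustrative.

```python
def copies_after(cycles: int, starting_molecules: int = 1) -> int:
    """Double-stranded molecules present after `cycles` rounds of perfect doubling."""
    return starting_molecules * 2 ** cycles


def cycles_needed(target_copies: int, starting_molecules: int = 1) -> int:
    """Smallest number of cycles that reaches at least `target_copies`."""
    cycles = 0
    while copies_after(cycles, starting_molecules) < target_copies:
        cycles += 1
    return cycles


for n in (1, 2, 20, 25, 30):
    print(f"{n:2d} cycles: {copies_after(n):,} copies")

print("cycles to pass one million:", cycles_needed(1_000_000))
```

Under this idealized model one million copies is passed at 20 cycles, so the 25 to 30 cycles quoted in the text leave a comfortable margin.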
DNA TYPING or PROFILING With STRs
The key to understanding DNA typing lies in the knowledge that within the world's
population there are numerous possibilities for the number of times a particular
sequence of base letters can repeat itself on a DNA strand. The possibilities
become even greater when one deals with two chromosomes, each containing different
lengths of repeating sequences. The most widely used technique for DNA typing
combines PCR (Polymerase Chain Reaction) with STR (Short Tandem Repeat) analysis.
STRs are locations (loci) on the chromosome which contain short sequence elements
that repeat themselves within the DNA molecule. They serve as helpful markers
for identification because they are found in great abundance throughout the
human genome. What is important to appreciate is that the repeating sequence
is relatively short in length, three to seven bases, and that the entire strand
of an STR is also very short; that is, less than 500 bases in length. This means
that STRs are much less susceptible to degradation and may often be recovered
from bodies or stains that have been subject to extreme decomposition. In order
to understand the utility of STRs in forensic science, let's look at one commonly
used STR known as HUMTH01. The DNA segment contains the repeating sequence AATG.
There have been seven HUMTH01 variants identified in the human genome. These
variants contain five through eleven repeats of AATG.
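Counting repeats in a known sequence is straightforward, as the following Python sketch suggests. The flanking bases around the repeats are invented for illustration and are not the real HUMTH01 sequence; the function name is also illustrative.

```python
import re


def longest_tandem_run(sequence: str, unit: str = "AATG") -> int:
    """Number of repeats in the longest uninterrupted run of `unit`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", sequence)
    return max((len(run) // len(unit) for run in runs), default=0)


# Hypothetical fragments: made-up flanking bases around 6 and 8 repeats.
six_repeats = "GGTC" + "AATG" * 6 + "AGGGA"
eight_repeats = "GGTC" + "AATG" * 8 + "AGGGA"

print(longest_tandem_run(six_repeats))
print(longest_tandem_run(eight_repeats))
```

In the laboratory the repeat count is inferred from fragment size rather than read from the sequence directly, so this sketch illustrates only what is being measured.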
In a forensic examination, HUMTH01 is extracted out of biological materials and
amplified by PCR. The ability to copy an STR means that extremely small amounts
of the molecule can be detected and analyzed. Once the STRs have been copied
or amplified, they are separated on an electrophoretic gel. By examining the
distance the STR has migrated on the electrophoretic plate, one can determine
the number of AATG repeats that exist in the STR. Every person has two STR types
for HUMTH01, each inherited from one parent. Thus, for example, one may find
in a semen stain HUMTH01 with six repeats and eight repeats. This combination
of HUMTH01 is only found in approximately 3.5% of the population.
What makes STRs so attractive to forensic scientists is that there are hundreds of
different types of STRs found in human genes. The more STRs one can characterize,
the smaller will be the percentage of the population from which these STRs can
emanate. This gives rise to the concept of multiplexing. Using the technology
of PCR, one can simultaneously extract and amplify a combination of different
STRs. For example, one system on the commercial market is the STR Blue kit.
This kit provides the necessary materials for the coamplification and detection
of three STRs - D3S1358, vWA, and FGA (triplexing). The design of the system
ensures that the size of the STRs does not overlap, thereby allowing each marker
to be viewed clearly on an electrophoretic gel.
In the United States, the forensic science community has standardized on thirteen
STRs for entry into a national database, known as the Combined DNA Index System
(CODIS). The thirteen STRs are listed in Table 1 along with their probabilities
of identity. The probability of identity is the probability that two individuals
selected at random will have an identical STR type. The smaller the value of
this probability the more discriminating will be the STR. A higher degree of
discrimination and even individualization can be attained by analyzing a combination
of STRs (multiplexing). Because STRs occur independently of each other, the
probability of biological evidence having a particular combination of STR types
is determined by the product of their frequency of occurrence in a population.
Hence, the greater the number of STRs characterized, the smaller will
be the frequency of occurrence of the analyzed sample in the general population.
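The product rule described above can be sketched in Python as follows. Only the 3.5% figure for HUMTH01 comes from the text; the other two frequencies are hypothetical placeholders, and the function name is illustrative.

```python
from math import prod


def combined_frequency(frequencies):
    """Frequency of a multi-STR profile, assuming the STRs are independent."""
    return prod(frequencies)


# 0.035 is the HUMTH01 (6, 8) figure quoted in the text.
# 0.08 and 0.05 are hypothetical values for two other STRs.
profile = [0.035, 0.08, 0.05]

frequency = combined_frequency(profile)
print(f"combined frequency: {frequency:.6f}")
print(f"about 1 in {round(1 / frequency):,}")
```

Each additional independent STR multiplies the frequency by a number smaller than one, which is why profiles built from many STRs become so rare.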
TABLE 1: THE 13 CODIS STRs AND THEIR PROBABILITIES OF IDENTITY
The combination of the first three STRs shown in Table 1 typically produces a frequency
of occurrence of about 1 in 5000. A combination of the first six STRs typically
yields a frequency of occurrence in the range of 1 in two million for the Caucasian
population, and if the top nine STRs are determined in combination, this frequency
declines to about 1 in one billion. The combination of all 13 STRs shown in
Table 1 typically produces frequencies of occurrence that measure one in the trillions.
Importantly, there are a number of commercially available kits which readily
allow forensic scientists to profile STRs in the kind of combinations cited
here.
Manufacturers of commercial STR kits typically used by crime laboratories have
made provisions to provide analysts with one additional piece of useful information
along with STR types; i.e., the gender or sex of the DNA contributor. The focus
of attention here is the amelogenin gene located on both the X and Y chromosomes.
This gene, which is actually the gene for tooth pulp, has an interesting
characteristic, in that it is shorter by six bases in the X chromosome as compared
to the Y chromosome. Hence, when the amelogenin gene is amplified by PCR and
separated by electrophoresis, a male with an X and a Y chromosome will show
two bands; a female having X and X chromosomes will have just one band. Typically,
these results are obtained in conjunction with STR types.
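The band-reading logic can be sketched in Python. The fragment lengths used below (106 and 112 bases) are illustrative values chosen only because they differ by six bases, as the text describes; actual lengths depend on the kit and primers used. The function name is illustrative.

```python
def sex_from_amelogenin(band_lengths):
    """Interpret amelogenin bands given their fragment lengths in bases.

    One distinct band is read as X only (female); two bands that differ
    by six bases are read as X and Y (male). Anything else, including
    no bands at all, is reported as inconclusive.
    """
    distinct = sorted(set(band_lengths))
    if len(distinct) == 1:
        return "female"
    if len(distinct) == 2 and distinct[1] - distinct[0] == 6:
        return "male"
    return "inconclusive"


# Illustrative fragment lengths, not taken from the article.
print(sex_from_amelogenin([106, 112]))
print(sex_from_amelogenin([106]))
print(sex_from_amelogenin([106, 110]))
```

Returning "inconclusive" for unexpected patterns is a design choice of this sketch, so that odd results are flagged rather than forced into one of the two categories.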