DNA Damage - the major cause of missing pieces from the DNA puzzle

DNA is nature’s most widely used long-term information storage system. The elegant simplicity of the double helix belies the complex pathways that have evolved to copy, modify, and maintain the integrity of the genome. In the last 50 years insights into the structure of DNA and the enzymes that act upon it have led to the development of powerful tools for genetic analysis and engineering, including cloning and subcloning with restriction enzymes, DNA sequencing and amplification using PCR. As DNA methodologies have matured they have been applied to more diverse problems such as deriving genetic information from degraded samples, including blood and tissue samples. Gathering sequence information from these samples is critical in forensic identification (1), in the Consortium for the Barcode of Life initiative (2), in deducing the evolutionary relationships of living and extinct species (3), and in associative studies of tissue biopsy collections (4), to name a few. Such studies are experimentally limited by the quality and quantity of the DNA extracted from biological samples. In addition, compounds that inhibit the amplification of the small amounts of extracted DNA often co-purify with the DNA sample, further complicating analysis. Damaged DNA has therefore become an important experimental issue in many areas of research.


Common Types of DNA Damage

There are many common types of DNA damage that impact accurate replication by DNA polymerases (5). Furthermore, the degree and spectrum of DNA damage depend on the sample source and the type of environment to which it was exposed. Some types of damage are ubiquitous and can potentially be present in all extracted DNA, while other types of damage are the result of exposure to a specific source (see Table 1). Under physiological conditions the most labile bond in DNA is the N-glycosyl bond that attaches the base to the deoxyribose backbone.

This is in contrast with RNA in which the phosphodiester bond in the backbone is the least stable under the same conditions. Hydrolysis of the N-glycosyl bond results in the loss of a base leaving an apurinic/apyrimidinic (AP) site that itself eventually decomposes into a nick. Because the reactive species is H2O, AP sites are expected in all stored DNA samples. This includes lyophilized samples because it is very difficult to remove the final shell of H2O molecules immediately adjacent to the DNA.

Under metabolically active conditions it is estimated that approximately 2,000-10,000 AP sites are formed in a single human cell genome each day (5). This rate will vary from sample to sample, especially in samples taken from a crime scene because the type of environmental exposure will vary.

The presence of AP sites in a DNA sample is problematic for two primary reasons. First, genetic information is lost because the AP site cannot form a base pair with an incoming nucleotide during DNA replication. Second, typical PCR polymerases stall at the AP site preventing further replication (6). If enough AP sites are present, amplification or sequencing reactions will simply fail. The breakdown of AP sites into nicks further compounds the problem as it eventually leads to the fragmentation of the DNA.
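The claim that enough AP sites will make amplification fail can be illustrated with a rough calculation. The sketch below is not from the original article: it assumes that blocking lesions are scattered independently and uniformly along the template (a Poisson model), so the chance that a single-stranded template of a given length carries no lesion is exp(-density × length). The function name and the damage density used are hypothetical, chosen only for illustration.

```python
import math


def intact_fraction(amplicon_length_nt: int, lesions_per_nt: float) -> float:
    """Expected fraction of template strands with zero blocking lesions.

    Assumption (not from the article): lesions occur independently and
    uniformly, so the lesion count per template is Poisson distributed
    with mean lesions_per_nt * amplicon_length_nt.
    """
    if amplicon_length_nt < 0 or lesions_per_nt < 0:
        raise ValueError("length and lesion density must be non-negative")
    return math.exp(-lesions_per_nt * amplicon_length_nt)


if __name__ == "__main__":
    # Hypothetical damage level: one AP site per 1,000 nucleotides.
    density = 1e-3
    for length in (100, 500, 1000, 5000):
        fraction = intact_fraction(length, density)
        print(f"{length:>5} nt amplicon: {fraction:.3f} of templates lesion-free")
```

Under this simplified model the amplifiable fraction falls off exponentially with amplicon length, which is consistent with the common practice of targeting short amplicons when working with degraded samples.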

Hydrolytic DNA Damage

Another common type of DNA damage that occurs under physiological conditions is the hydrolytic deamination of cytosine to form uracil (5). Sequencing studies on DNA extracted from very old samples, termed ancient DNA, have determined that this is the major damage complicating data analysis (7,8). Cytosine deamination, like AP site formation, is caused by hydrolysis and is probably present in the DNA extracted from many sources. Interestingly, unlike depurination, cytosine deamination is slower in double-stranded DNA than in single-stranded DNA.

The effect of deaminated cytosine in the amplification or sequencing reaction is polymerase dependent. Some polymerases (e.g. Taq DNA Polymerase) are able to effectively extend past the deaminated cytosine (i.e. uracil), inserting an adenine residue opposite the uracil instead of a guanine. This generates a mutated daughter strand even though the polymerase copied the damaged template accurately. Alternatively, common proof-reading polymerases, including archaeal polymerases (e.g. Vent, Pfu, 9°N DNA Polymerases), stall at deoxyuracil encountered in DNA templates (9). These polymerases contain a binding pocket that specifically recognizes deaminated cytosine (10). This prevents the damage from creating a permanent mutation in the daughter strand. Because deamination of cytosine may result in inhibition of PCR or mutagenic DNA products, this is a particularly important issue in methods where DNA sequence is crucial. In contrast, methods that rely on the amplicon length rather than the exact sequence (e.g. short tandem repeats used in human identification) are not impacted by the mutagenic effect of cytosine deamination.
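The two polymerase behaviors described above can be mimicked with a small toy model. This is a simplified sketch, not a description of real enzyme kinetics: the template is written in the order the polymerase reads it, the daughter strand is shown in the same left-to-right register for easy comparison, and the uracil-sensing polymerase is modeled as simply stopping when it reaches the uracil. All function names and the example sequence are hypothetical.

```python
# Base-pairing rules; uracil (deaminated cytosine) pairs with adenine.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}


def deaminate(template: str, positions) -> str:
    """Return a copy of the template with cytosines at the given
    0-based positions converted to uracil."""
    bases = list(template)
    for i in positions:
        if bases[i] != "C":
            raise ValueError(f"position {i} is {bases[i]}, not C")
        bases[i] = "U"
    return "".join(bases)


def copy_read_through(template: str) -> str:
    """Toy model of a polymerase that extends past uracil (Taq-like),
    inserting A opposite U."""
    return "".join(COMPLEMENT[base] for base in template)


def copy_with_uracil_stall(template: str) -> str:
    """Toy model of a uracil-sensing polymerase: synthesis stops at the
    first uracil, so no mutated full-length product is made."""
    daughter = []
    for base in template:
        if base == "U":
            break
        daughter.append(COMPLEMENT[base])
    return "".join(daughter)


if __name__ == "__main__":
    template = "GATCCGTA"                # hypothetical undamaged template
    damaged = deaminate(template, [3])   # C at index 3 becomes U
    print("undamaged copy:   ", copy_read_through(template))
    print("read-through copy:", copy_read_through(damaged))
    print("stalled copy:     ", copy_with_uracil_stall(damaged))
```

In this model the read-through product has the same length as the copy of the undamaged template but carries an A where a G belongs, while the stalled product is truncated. That mirrors the point made above: length-based assays are unaffected by the mutation, whereas sequence-based assays are.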

Oxidative DNA Damage

A third common type of DNA damage is oxidation. As in the case of hydrolytic damage, most DNA samples are susceptible to oxidation, as they are exposed to oxygen throughout storage. Many types of base modifications are created by oxidation, but the conversion of guanine to 8-oxo-guanine is one of the most common (5). 8-oxo-guanine can base pair with adenine and is therefore a mutagenic product. Such damage is prevalent in mitochondria and may be one of the factors in the aging process (11). In studies that quantify oxidative damage it has been shown that the DNA extraction process itself can introduce this modification and therefore must be carefully considered (12).

Other Types of DNA Damage

Other types of damage become prevalent only in certain circumstances. DNA-protein or DNA-DNA crosslinks are a specialized, but important, type of damage that blocks the genetic investigation of an enormous number of stored samples. In both the museum and medical research communities a large number of samples are either stored in formalin (formaldehyde) or exposed to formalin at some point. The formalin-induced cross-linking effectively preserves structural morphology, but it is extremely detrimental to subsequent DNA analysis because crosslinked bases stall polymerases and DNA-DNA crosslinks can inhibit denaturation. In addition, the pH of formalin solutions drops over time due to the formation of formic acid, increasing the rate of AP site formation and subsequent fragmentation (13).

Other well-studied lesions that occur only in certain instances are pyrimidine dimers (14). These form when DNA is exposed to UV light and are very effective at stalling DNA polymerases.

In conclusion, there is a wealth of DNA sequence information contained in degraded samples; however, extracting that information is sometimes difficult. Whether the major difficulty is the efficacy of DNA extraction, the presence of PCR inhibitors, or the extent of DNA damage has not been fully determined. Most likely the most prevalent problem varies with the sample, and techniques that address all three possibilities are needed. Studies determining what types of damage are present in degraded samples and new methodologies to overcome them will hopefully make previously hidden information accessible.

Table 1

| Source of DNA | Potential Damage | Comments | References |
|---|---|---|---|
| Ancient DNA | abasic sites, deaminated cytosine, oxidized bases, fragmentation, nicks | Cytosine deamination has been reported to be the most prevalent cause of sequencing artifacts in ancient DNA. | Gilbert, M.T. et al. (2007) Nucleic Acids Res., 35, 1–10. Hofreiter, M. et al. (2001) Nucleic Acids Res., 29, 4793. |
| Environmental DNA | fragmentation, nicks (plasmid or genomic) | Nicks and fragmentation can increase the formation of artifactual chimeric genes during amplification. | Qiu, X. et al. (2001) Appl. Envir. Microbiol., 67, 880. |

| Source of Damage | Potential Damage | Comments | References |
|---|---|---|---|
| Exposure to Ionizing Radiation | abasic sites, oxidized bases, fragmentation, nicks | Ionizing radiation is used to sterilize samples. | Sutherland, B.M. et al. (2000) Biochemistry, 39, 8026. |
| Exposure to Heat | fragmentation, nicks, abasic sites, oxidized bases, deaminated cytosine, cyclopurine lesions | Heating DNA accelerates the hydrolytic and oxidative reactions in aqueous solutions. | Bruskov, V.I. (2002) Nucleic Acids Res., 30, 1354. |
| Phenol/Chloroform Extraction | oxidized bases | Guanine is more sensitive to oxidation than the other bases and forms 8-oxo-G. 8-oxo-G can base pair with A, making this damage potentially mutagenic. | Finnegan, M.T. (1995) Biochem. Soc. Trans., 23, 403S. |
| Exposure to Light (UV) | thymine dimers (cyclobutane pyrimidine dimers), pyrimidine (6–4) photoproducts | UV trans-illumination to visualize DNA causes thymine dimer formation. | Cadet, J. et al. (2005) Mutat. Res., 571, 3–17. Pfeifer, G.P. et al. (2005) Mutat. Res., 571, 19–31. |
| Mechanical Shearing | fragmentation, nicks | Normal DNA manipulations such as pipetting or mixing can shear or nick DNA. | |
| Desiccation | fragmentation, nicks, oxidized bases | | Mandrioli, M. et al. (2006) Entomol. Exp. App., 120, 239. |
| Storage in Aqueous Solution | abasic sites, oxidized bases, deaminated cytosine, nicks, fragmentation | Long-term storage in aqueous solution causes the accumulation of DNA damage. | Lindahl, T. et al. (1972) Biochemistry, 11, 3610 and 3618. |
| Exposure to Formalin | DNA-DNA crosslinks, DNA-protein crosslinks | Formaldehyde solution that has not been properly buffered becomes acidic, increasing abasic site formation. | Workshop on recovering DNA from formalin preserved biological samples. (2006) The National Academies Press. |

Note: The extent of damage caused by exposure to different reagents can vary, and its importance will depend on how the DNA is being used.



  1. Butler, J.M., (2006) J. Forensic Sci., 51, 253–265.
  2. Hajibabaei, M., et al. (2007) Trends Genet., in press.
  3. Noonan, J.P., et al. (2006) Science, 314, 1113–1118.
  4. Thompson, E.R. et al. (2005) Hum. Mutat., 26, 384–389.
  5. Lindahl, T., (1993) Nature, 362, 709–715.
  6. Sikorsky, J.A., et al. (2007) Biochem. Biophys. Res. Comm., 355, 431–437.
  7. Stiller, M., et al. (2006) Proc. Natl. Acad. Sci. USA, 103, 13578–13584.
  8. Gilbert, M.T., et al. (2007) Nucleic Acids Res., 35, 1–10.
  9. Lasken, R.S., et al. (1996) J. Biol. Chem., 271, 17692–17696.
  10. Fogg, M.J., et al. (2002) Nat. Struct. Biol., 9, 922–927.
  11. Weissman, L., et al. (2007) Neuroscience, in press.
  12. Hofer, T., et al. (2006) Biol. Chem., 387, 103–111.
  13. Kelly, K. (2006) Path to Effective Recovering of DNA from Formalin-Fixed Samples in Natural History Collection, (pp. 5–14). Washington, DC: The National Academies Press.
  14. Sinha, R.P., et al. (2002) Photochem. Photobiol. Sci., 1, 225–236.
From NEB expressions Spring 2007, vol 2.1
By Thomas C. Evans, Jr., New England Biolabs, Inc.