DNA is the bedrock of our genetic identity. It encodes physical and mental traits in our genes and it is generally unchanging and the same in all cells. There are ‘epigenetic’ aspects, some of which are still puzzling, that appear to carry over between generations, which are not due to the modification of DNA sequence. (Other posts have explored epigenetics.) But DNA sequence is the default carrier of our genetic identity.
Although the DNA in our cells is largely unchanging and the same in every cell, neither of those features is strictly true. As cells grow and divide, their DNA sometimes changes. The type of change I’m thinking of is damage due to chemical or physical events that change the structure of DNA. DNA-damaging events are surprisingly common, and if the damage they cause is not repaired, or corrected, they can lead to permanent DNA sequence changes, in other words, mutations. Mutations can have effects on the biology of a cell and its descendants. They can also lead to disease, with cancer a prominent example.
Something that damages DNA is a genotoxic agent; cells are under constant genotoxic stress (1). Estimates are that every cell undergoes tens of thousands of genotoxic events a day. Fortunately, almost every damaging event is reversible, and is in fact reversed. Thanks to an armamentarium of repair systems, the DNA in a cell almost always has the same, unmutated sequence at the end of the day as it had at the beginning.
Most people are aware that the environment contains DNA-damaging agents: we may think of tobacco smoke, or nitrates in processed meats, or gamma rays from outer space. All of these are indeed sources of damage to our DNA (space radiation may be of concern primarily to astronauts). But more than 90% of the DNA damage arises spontaneously.
The stability of DNA is relative — for example, RNA is much less stable, and would not make a good genetic repository for higher forms of life (viruses are another matter). Protein is also less stable than DNA. But human DNA floats in a warm sea of potentially reactive chemicals, including internally generated free radicals, that can cause the cleavage or alteration of chemical linkages in DNA. Like any chemical, DNA is of finite stability.
Since only a small percentage of our DNA codes for proteins, a randomly occurring damaging event will probably not hit a protein-coding or a gene-regulating region. (Scientists sometimes refer to DNA with no known function as ‘junk’. The term is contested.) Also, many changes to coding DNA would not change the protein made from it; there are on average 3 codons for each amino acid, so a change of a single base in genomic DNA may have no effect on the sequence of the protein produced (there can be second-order effects).
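The codon arithmetic can be checked with a short Python sketch. It builds the standard genetic code, counts codons per amino acid, and asks how many single-base changes leave the amino acid untouched. The function names are my own, and the calculation assumes every base substitution is equally likely, which is not true in real cells, so treat the result as a rough guide only.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code with codons listed in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons_per_amino_acid():
    """Average number of codons per amino acid, ignoring stop codons."""
    sense = [aa for aa in CODON_TABLE.values() if aa != "*"]
    return len(sense) / len(set(sense))


def synonymous_fraction():
    """Fraction of single-base changes to sense codons that keep the amino acid."""
    total = 0
    unchanged = 0
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                total += 1
                if CODON_TABLE[mutant] == aa:
                    unchanged += 1
    return unchanged / total


print(f"codons per amino acid: {codons_per_amino_acid():.2f}")
print(f"silent single-base changes: {synonymous_fraction():.1%}")
```

Under that equal-likelihood assumption, roughly a quarter of single-base changes in a coding sequence are silent at the protein level.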
But there will still be hundreds of DNA-damaging events each day within coding sequences, which, if not repaired, can be consequential. The feature that saves our genetic core is that essentially all DNA damage is repaired before it can lead to mutation. The energy and time required to repair DNA is part of the price of staying alive, which is equally true of bacteria, laboratory rats, daffodils, and humans.
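Here is the back-of-envelope arithmetic behind "hundreds" as a small Python sketch. The 1.5% protein-coding fraction is my assumed round figure for "a small percentage", the daily event counts are stand-ins for "tens of thousands", and the calculation assumes damage lands uniformly across the genome.

```python
def expected_coding_hits(events_per_day, coding_fraction):
    """Expected damaging events per day that fall inside coding sequence."""
    return events_per_day * coding_fraction


CODING_FRACTION = 0.015  # assumed: about 1.5% of the genome codes for protein

for events in (10_000, 50_000):  # assumed range for "tens of thousands"
    hits = expected_coding_hits(events, CODING_FRACTION)
    print(f"{events} events/day -> about {hits:.0f} in coding DNA")
```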
How is DNA damage repaired?
The importance of DNA damage repair is underscored by the amount of cellular metabolism dedicated to it: of the 25,000 or so human genes, more than 160 are involved in DNA repair. They span a wide variety of biochemical reactions.
An example that illustrates the complexity of a DNA repair pathway is the replacement of a damaged cytosine (C) residue. Cytosine in DNA can be attacked by nitric oxide (which may be in the air or may be generated by nitrates and nitrites found in cured meats). In this attack, the amino group of C is lost: this deamination converts C to the base uracil (U). If the U is copied in the next round of DNA synthesis prior to cell division, the base adenine (A) would be laid down in the complementary strand, not guanine (G) (C pairs with G; U with A). Thus, a mutation.
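To see how one unrepaired U becomes a permanent change, here is a toy Python sketch. It treats a strand as a plain string and pairs bases position by position, ignoring strand orientation and everything else a real polymerase does. The function names are hypothetical.

```python
# Base-pairing rules; U is read like T and so pairs with A.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}


def deaminate(strand, pos):
    """Convert the C at position pos to U, as in the deamination described above."""
    if strand[pos] != "C":
        raise ValueError("only cytosine is deaminated in this sketch")
    return strand[:pos] + "U" + strand[pos + 1:]


def copy_strand(template):
    """Return the partner strand laid down opposite the template."""
    return "".join(PAIR[base] for base in template)


original = "ATCG"
damaged = deaminate(original, 2)      # C -> U
new_partner = copy_strand(damaged)    # A is laid down opposite U
next_round = copy_strand(new_partner)

print(original, "->", damaged, "->", next_round)
```

After two rounds of copying, the original C:G pair has become a T:A pair, and no repair system has anything abnormal left to recognize.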
U is not normally found in DNA — it’s a component of RNA, but not DNA. Its aberrant presence in DNA is detected by an enzyme specific for that function, which begins the repair process. First, that enzyme cuts the bond linking the U to the DNA sugar-phosphate backbone. Then another enzyme cleaves the backbone on both sides of the lesion. Then another enzyme adds the proper residue, C (guided by the G on the complementary strand of the DNA), and finally yet another enzyme seals the break. There is a variety of jobs, just as for the crew repairing a pothole in the road, where some workers cruise the streets looking for damage, some put up barricades and control traffic around the site, some hammer out the broken pavement, some pour in the fresh tar to repair the road. (I can’t think of a parallel to the guy in the suit and hard hat holding a clipboard.)
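The four repair steps can be written out as a toy Python sketch. This is an illustration of the sequence of jobs, not a model of the real enzymes: the strand is a string, the partner strand is paired position by position, and the function name and log messages are my own invention.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def repair_uracil(damaged, partner):
    """Toy version of the four repair steps; returns (repaired strand, log)."""
    strand = list(damaged)
    log = []
    for i, base in enumerate(damaged):
        if base != "U":
            continue
        # Step 1: the uracil is detected and cut away from the backbone.
        log.append(f"position {i}: uracil detected and removed")
        # Step 2: the backbone is cleaved at the lesion, leaving a gap.
        strand[i] = None
        log.append(f"position {i}: backbone cut around the lesion")
        # Step 3: the correct base is added, guided by the partner strand.
        strand[i] = COMPLEMENT[partner[i]]
        log.append(f"position {i}: {strand[i]} inserted opposite {partner[i]}")
        # Step 4: the break is sealed.
        log.append(f"position {i}: break sealed by ligase")
    return "".join(strand), log


repaired, steps = repair_uracil("ATUG", "TAGC")
print(repaired)
for step in steps:
    print(step)
```

The point of the sketch is that the undamaged partner strand carries the information needed to restore the original C.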
The 160+ human enzymes involved in DNA repair recognize breakage of chemical bonds, modifications of bases, and cross-linking of strands or bases within a strand, among other perturbations. Many of these enzymes make other contributions to cellular life as well: for example, the enzyme that seals the break after removal of uracil seals DNA breaks wherever they occur. All cells have a form of this enzyme (it’s called a ‘ligase’; it ligates the strands at the site of a break). Even some viruses code for ligase.
Indeed, the systems for DNA repair are similar, where they are present, in simple organisms and in complex ones. Higher organisms like us have more of them. But we’ve learned a great deal about DNA repair from organisms as simple as the gut bacterium E. coli. E. coli has about one one-thousandth as much DNA as a human cell, but some very similar DNA repair systems. And it was a lot easier working out the mechanisms of repair in bacteria and then looking for them in human cells than going directly to the study of human cells themselves. Thank you, biological homology.
Because of the DNA repair systems, the number of damaging events that persist to form mutations is minute. An estimated rate of mutation is 10 to 40 per year for cells of various phenotypes (scattered over more than 3 billion base pairs) (2). During that year, the number of DNA-damaging events taking place in the cell is on the order of 10 million, so all but a few per million are repaired.
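The numbers in that comparison can be worked through in a few lines of Python. The 30,000 events per day is my assumed stand-in for "tens of thousands", and the function name is hypothetical.

```python
def unrepaired_per_million(mutations_per_year, damage_events_per_day):
    """Mutations that persist, per million damaging events in a year."""
    events_per_year = damage_events_per_day * 365
    return mutations_per_year / events_per_year * 1_000_000


EVENTS_PER_DAY = 30_000  # assumed value for "tens of thousands"

print(f"damaging events per year: {EVENTS_PER_DAY * 365:,}")
for mutations in (10, 40):
    rate = unrepaired_per_million(mutations, EVENTS_PER_DAY)
    print(f"{mutations} mutations/year -> {rate:.1f} unrepaired per million events")
```

With these inputs the yearly total comes to about 11 million damaging events, and somewhere between one and four per million slip through as mutations.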
The Checkpoint response
Entry of a cell into the cell cycle with damaged DNA can lead to a variety of problems, including cell death. So it’s not surprising that cells have mechanisms that check the integrity of DNA before allowing them to begin dividing. These mechanisms are part of the ‘Checkpoint Response’, in which the presence of damaged DNA activates signaling pathways that temporarily halt progression through the cell cycle until the DNA can be repaired.
Cancer cells often have defective checkpoint systems of one kind or another. It’s a source of power, but potentially also a weakness. For those cells, failure of the checkpoint allows the genetically-damaged cancer cell to progress and divide and possibly incorporate yet another mutation. (Another way in which checkpoints are important in cancer is described in a previous post; cancer cells can deliver false checkpoint signals to the cellular immune system and prevent it from attacking the tumor cells.)
Overwhelming DNA damage can also push cells through the normal checkpoint barrier. This may result from exposure to a massive genotoxic stress (think of the doomed firemen trying to extinguish the Chernobyl fire). Sometimes, if DNA damage cannot be repaired, a cell will commit suicide by initiating a death pathway.
Defects in DNA Damage Response (DDR)
Defects in the DDR are the primary drivers in several pathological conditions. For a genetic defect to be passed down through the generations, it must be present in the germ cells that go on to form sperm or eggs. Mutations in somatic cells, the cells of the developed body, can affect cells or tissues that descend from that cell, but they are not transmitted to subsequent generations. But somatic mutation is important; that’s how cancer starts.
Because cancer is a genetic disease (it is produced by mutations or gene dysregulation), it isn’t surprising that a failure of DDR results in an increased probability of cancers. Mutations in the DNA repair genes BRCA1 and BRCA2 create an inherited susceptibility to breast cancer. The defects in these conditions are due to an inability to repair double-stranded breaks in DNA, which can lead to tumorigenesis.
There are seven different DNA repair genes that can be defective in the related medical conditions Xeroderma pigmentosum, Cockayne Syndrome, and trichothiodystrophy. All these lead to an increased sensitivity to ultraviolet light. All show the same kind of defect in DNA repair, and cause a range of effects from freckling and burning of the skin by sunlight, to cancers of the skin, such as melanoma. The enzymatic defect in these conditions results in an inability to cut out a stretch of DNA containing a UV-damaged nucleotide (this is called ‘nucleotide excision repair’). When the UV light from the sun leads to a chemical change in a DNA base, the damaged stretch would normally be excised and replaced. In Xeroderma pigmentosum, for example, this repair is impeded, and the resulting mutation can lead to cancer.
There are other mutations that have a broad range of effects, including on DDR. Some of these are not in the enzymes that carry out DNA repair, but in proteins that regulate cell growth and division, and thus the response to damaged DNA. These conditions can result in neurodegenerative diseases, radiation sensitivity, infertility, and failure of the bone marrow to deliver blood cells, as well as a predisposition to cancer.
Dr. Lynch’s insight
A condition that illuminates the importance of the DDR in preventing cancer is Lynch syndrome, also known as hereditary nonpolyposis colorectal cancer (HNPCC). This condition is named for Dr. Henry Lynch, who first described it as a genetic basis for certain cancers in 1966. At the time, the favoured theory was that cancer was caused by viruses. Although some cancers are, it is now generally agreed that it is genetic defects, mutations, that lead to cancer (viruses that cause cancer also do it through a genetic mechanism).
Dr. Lynch observed families with a high incidence of certain cancers. For example, he knew of a family whose members had an 80% chance of getting colon cancer, and at a younger age than is usual. But in 1966 his was a lonely voice among the virus hunters. Convincing evidence that he was right came only in 1993, when the gene responsible for HNPCC was cloned and characterised (3). This research was complex, and brought together 35 scientists from twelve research centers in five countries.
In the search for the Lynch Syndrome (HNPCC) gene, genetic markers, consisting of short, repetitive ‘microsatellite’ sequences on chromosome 2, were linked to the presence or absence of the Lynch syndrome in cancer-prone families. This segment of DNA was cloned, and its sequence determined. The likely function of a gene in it was identified by comparison to the DNA database, which produced a hit on a yeast gene. That yeast gene itself was related to one in E. coli bacteria. In E. coli and yeast this gene coded for a DNA repair enzyme that corrects ‘mismatches’ in the DNA duplex. The different species, bacteria, yeast, and human, have similar enzymes directed at that same essential function.
The discovery of the genetic basis of HNPCC was revealing. It raised some questions, among them, why does this DDR defect lead to a large increased risk specifically of colorectal cancer? Perhaps a partial answer is that cells in the gut are proliferating all the time, and have an average lifetime of less than a week. This requires almost constant churning through the cell cycle, and lots of DNA synthesis, with a greater opportunity for error than in most cells, which live more sedate lives. In HNPCC, the defective enzyme fails to correct those mistakes, and a high rate of mutation, including mutations of tumor suppressors and oncogenes, is the result.
We live in the Goldilocks zone
Why hasn’t evolution led to a more effective system for eliminating DNA damage and preventing mutations? The answer, in part, is that without mutation there is no evolution, and the mutation rates today have presumably been selected as optimal for the human population. The genetic and energetic price of having even further anti-mutation mechanisms is probably also prohibitive, and in any case, no chemical system is going to be perfect; defects are inevitable. Geneticists have long recognized that if there are too many mutations per cell generation the system is unstable, and genetic selection will not work. We live in the Goldilocks zone that evolution has selected for us: not too much mutation, not too little.
- Abascal, F., Harvey, L. M. R., Mitchell, E., et al. Somatic mutation landscapes at single-molecule resolution. Nature 593(7859):405-410 (2021)
- Torgovnick, A. and Schumacher, B. DNA repair mechanisms in cancer development and therapy. Front. Genet. 6:157-172 (2015)
- Leach, F. S., Nicolaides, N. C., Papadopoulos, N., Liu, B., Jen, J., Parsons, R., Peltomaki, P., et al. Mutations of a mutS homolog in hereditary nonpolyposis colorectal cancer. Cell 75:1215-1225 (1993)