Protein Synthesis — Breaking the Code

Philip Iannaccone
Jun 6, 2017


Francis Crick (right) and James Watson (middle) with Maclyn McCarty (left). Image from Lederberg J, Gotschlich EC (2005) A Path to Discovery: The Career of Maclyn McCarty. PLoS Biol 3(10): e341 doi:10.1371/journal.pbio.0030341

May 23rd, 2013 (reblogged from

DNA has four “letters” (a, c, t, g) that define proteins, which are long chains of chemicals called amino acids (there are 20 of them). Proteins are what make a cell what it is, and the job of DNA is to “code” that information. We have roughly 20,000 protein-coding genes embedded in three billion “letters”, and these are what cause us to look and be as we are. Or a frog to look like its froggy parents.

How can four nucleotides code for 20 amino acids, and code for something to start and stop the protein construction?

Let’s take a look at the math:

  • 1 nucleotide (a,t,g,c) could define four amino acids (4¹ = 4)
  • 2 could define 16 amino acids (4² = 16)
  • 3 could define 64 amino acids (4³ = 64)
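
The arithmetic above can be checked with a few lines of Python. This is only a sketch of the counting argument; the function names are made up for illustration and are not part of any library.

```python
# Count how many distinct "words" can be written with four nucleotides.
NUCLEOTIDES = "ACGT"
AMINO_ACIDS = 20  # number of amino acids the code has to cover


def codeword_count(length: int) -> int:
    """Number of distinct nucleotide words of the given length (4 ** length)."""
    return len(NUCLEOTIDES) ** length


def shortest_sufficient_length(needed: int) -> int:
    """Smallest word length whose word count covers `needed` meanings."""
    length = 1
    while codeword_count(length) < needed:
        length += 1
    return length


if __name__ == "__main__":
    for n in (1, 2, 3):
        enough = codeword_count(n) >= AMINO_ACIDS
        print(f"{n} nucleotide(s): {codeword_count(n)} words, enough for 20? {enough}")
    print("shortest sufficient length:", shortest_sufficient_length(AMINO_ACIDS))
```

Two nucleotides give only 16 words, which is too few for 20 amino acids, so three is the shortest word length that works, with room left over for start and stop signals.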

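That the code is read three nucleotides at a time (a triplet) was demonstrated by Francis Crick and his colleagues in 1961. They mutated the DNA of a virus that infects bacteria, adding or removing single nucleotides to cause a frame shift. Two such mutations of the same kind inactivated the protein, three restored its activity (the reading frame was reestablished), and four once again inactivated it.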

Let’s look at an example. Suppose we wanted to send a text message to a friend:
“THE CAT HAD THE RED HAT”

If the first two letters were omitted, that message would now look like this:
“ECA THA DTH ERE DHA T” (2 mutations = unreadable)

If the first three letters were omitted:
“CAT HAD THE RED HAT” (3 mutations = readable)

And with four letters omitted:
“ATH ADT HER EDH AT” (4 mutations = unreadable)

adapted from Understanding DNA and Gene Cloning, Drlica, 1984

We can see that with two or four mutations (omissions), the message becomes unreadable; that is, the protein becomes inactive. But with three mutations we can read the message again, meaning that the protein is active. We can therefore conclude that these protein messages are read in triplets.
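
The text message experiment is easy to replay in code. The sketch below assumes the original message is "THE CAT HAD THE RED HAT", which is what the shifted versions above imply, and uses a hypothetical helper, read_in_triplets, that drops letters from the front and regroups the rest in threes.

```python
def read_in_triplets(message: str, omitted: int) -> str:
    """Drop `omitted` letters from the start, then regroup the rest in threes."""
    letters = message.replace(" ", "")[omitted:]
    triplets = [letters[i:i + 3] for i in range(0, len(letters), 3)]
    return " ".join(triplets)


# Assumed original message, inferred from the shifted examples in the text.
MESSAGE = "THE CAT HAD THE RED HAT"

if __name__ == "__main__":
    for omitted in (0, 2, 3, 4):
        print(f"{omitted} omitted: {read_in_triplets(MESSAGE, omitted)}")
```

Only omissions that are a multiple of three leave the remaining words readable, which is the same logic Crick used to infer the triplet code.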

Once it was known that three nucleotides define each amino acid, scientists worked feverishly, using painstaking methods, to establish which three “letters” define which amino acid.

In the following table we can see which three “letters” correspond to each amino acid, shown by its abbreviation (remember that in RNA the thymine base (T) of DNA is replaced by uracil (U) during transcription).

From Trends in Genetics

Notice that more than one triplet defines each amino acid, except for methionine (Met) and tryptophan (Trp), which have one triplet each. The methionine triplet (AUG) is also where the construction of a protein starts (its Start code). Notice also that three triplets stop the construction (Stop codes). This redundancy results in robust machinery for constructing proteins, which we will return to later when we discuss the cellular machines that actually construct the proteins using the DNA code.
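
For readers who like to tinker, here is a minimal Python sketch of the standard code, using one-letter amino acid abbreviations and "*" for Stop. The helper names (transcribe, translate) are made up for illustration, and the sketch makes two simplifying assumptions: the DNA given is the coding strand, so transcription is just swapping T for U, and translation begins at the first AUG it finds.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, listed in UCAG order (first base changes slowest).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna: str) -> str:
    """Simplified transcription: treat the input as the coding strand, T -> U."""
    return dna.upper().replace("T", "U")


def translate(rna: str) -> str:
    """Read triplets from the first AUG (Start) until a Stop triplet."""
    start = rna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":  # Stop code
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    # Count how many triplets define each amino acid (or Stop).
    counts = Counter(CODON_TABLE.values())
    for amino_acid, n in sorted(counts.items(), key=lambda item: item[1]):
        print(amino_acid, n)
    print(translate(transcribe("ATGTGGTAA")))
```

Counting the entries shows how uneven the redundancy is: some amino acids have six triplets, while others have only one.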

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License.


Phil Iannaccone is a Professor of Pediatrics and Pathology at Northwestern University Feinberg School of Medicine. He is the Director of the Developmental Biology Program of the Stanley Manne Children’s Research Institute, which houses his active research lab. Dr. Iannaccone received his baccalaureate degrees from the State University of New York (S.U.N.Y.) College of Environmental Sciences and Forestry and Syracuse University. He received his M.D. from S.U.N.Y. Upstate Medical Center and his Ph.D. from Lincoln College, University of Oxford, England. Dr. Iannaccone has served as Chairman of the Board of Scientific Counselors of the National Institute of Environmental Health Sciences and as a member of the National Advisory Environmental Health Sciences Council.


