Protein Synthesis: Breaking the Code

Francis Crick (right) and James Watson (middle) with Maclyn McCarty (left). Image from Lederberg J, Gotschlich EC (2005) A Path to Discovery: The Career of Maclyn McCarty. PLoS Biol 3(10): e341 doi:10.1371/journal.pbio.0030341

DNA has four “letters” (A, C, T, G) with which to define proteins, which are long chains of chemicals called amino acids (there are 20 amino acids). Proteins are what make a cell what it is, and the job of DNA is to “code” that information. We have roughly 20,000 protein-coding genes embedded in three billion “letters”, and these are what cause us to look and be as we are, or a frog to look like its froggy parents.

How can four nucleotides code for 20 amino acids, and code for something to start and stop the protein construction?

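Let’s take a look at the math. One nucleotide per amino acid gives only 4 possibilities, and two give 4 × 4 = 16, still fewer than 20. Three give 4 × 4 × 4 = 64, more than enough for 20 amino acids plus start and stop signals.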
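As a quick check of the counting, here is a minimal Python sketch. The function name `count_words` and the requirement of 21 signals (20 amino acids plus at least one stop signal, with the start signal doubling as an amino acid) are assumptions made for illustration.

```python
from itertools import product

NUCLEOTIDES = "ACGU"   # the four RNA "letters"
AMINO_ACIDS = 20       # amino acids that must be encoded
# Assumption for this sketch: at least one extra "stop" signal is needed.
SIGNALS_NEEDED = AMINO_ACIDS + 1


def count_words(length):
    """Count the distinct 'words' of the given length over four letters."""
    return len(list(product(NUCLEOTIDES, repeat=length)))


for length in (1, 2, 3):
    total = count_words(length)
    verdict = "enough" if total >= SIGNALS_NEEDED else "too few"
    print(f"{length} letter(s) per word: {total} combinations ({verdict})")
```

Words of one or two letters give 4 and 16 combinations, which is too few. Words of three letters give 64, which leaves room to spare.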

That the code is read three nucleotides at a time (a triplet) was shown by Francis Crick and his colleagues, who mutated the DNA of a virus to cause a frame shift. Two mutations inactivated the protein, three caused it to be active again (the frame was reestablished), and four once again inactivated the protein.

Let’s look at an example. Suppose we wanted to send a text message to a friend:
“THE CAT HAD THE RED HAT”

If the first two letters were omitted, that message would now look like this:
“ECA THA DTH ERE DHA T” (2 mutations = unreadable)

If the first three letters were omitted:
“CAT HAD THE RED HAT” (3 mutations = readable)

And with four letters omitted:
“ATH ADT HER EDH AT” (4 mutations = unreadable)

Adapted from Understanding DNA and Gene Cloning, Drlica, 1984

We can see that with two or four mutations (or omissions) the message becomes unreadable; that is, the protein becomes inactive. But with three mutations we can read the message again, meaning that the protein is active. We can therefore conclude that these protein messages are read in triplets.
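The same experiment can be replayed in a few lines of Python. This is only an illustration of the text-message analogy, not a model of real mutations. It takes the message to be “THE CAT HAD THE RED HAT” (the sentence the fragments above come from), and the function name `read_in_triplets` is made up for this sketch.

```python
MESSAGE = "THE CAT HAD THE RED HAT"


def read_in_triplets(message, deleted):
    """Drop the first `deleted` letters, then regroup the rest in threes."""
    letters = message.replace(" ", "")[deleted:]
    groups = [letters[i:i + 3] for i in range(0, len(letters), 3)]
    return " ".join(groups)


for deleted in (2, 3, 4):
    print(f"{deleted} letters omitted: {read_in_triplets(MESSAGE, deleted)}")
```

Only the three-letter deletion leaves readable words, because it shifts the reading frame by exactly one whole triplet.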

Once it was known that three nucleotides define an amino acid, scientists worked feverishly, using painstaking methods, to establish which three “letters” define which amino acid.

In the following table we can see which amino acid (abbreviated as defined below) each combination of three “letters” specifies. Remember that in RNA the thymidine nucleotide of DNA is replaced by uridine during transcription, so the table uses U where DNA has T.

From Trends in Genetics

Notice that more than one triplet defines each amino acid, except for methionine (Met) and tryptophan (Trp), which have one triplet each. Methionine is where the construction of a protein starts (its triplet is the Start code). Notice also that there are three triplets that stop the construction (Stop codes). This redundancy results in robust machinery for constructing proteins, a point we will return to later when we discuss the cellular machines that actually construct the proteins using the DNA code.
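To make the table concrete, here is a hedged Python sketch that builds the standard genetic code as a dictionary, counts how many triplets define each amino acid, and translates a short made-up sequence. It uses one-letter amino acid codes (M for methionine, W for tryptophan, `*` for Stop) rather than the three-letter abbreviations in the table. The helper names and the example DNA string are assumptions for illustration, and `translate` is a deliberate simplification of what the cell does.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, with codons ordered UUU, UUC, UUA, UUG, UCU, ...
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # first base U
    "LLLLPPPPHHQQRRRR"   # first base C
    "IIIMTTTTNNKKSSRR"   # first base A
    "VVVVAAAADDEEGGGG"   # first base G
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna):
    """Turn a DNA coding-strand sequence into RNA by swapping T for U."""
    return dna.upper().replace("T", "U")


def translate(mrna):
    """Read triplets from the first AUG until a Stop code (simplified)."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


codons_per_amino_acid = Counter(CODON_TABLE.values())
single_codon = sorted(aa for aa, n in codons_per_amino_acid.items() if n == 1)
stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")

print("Defined by one triplet only:", single_codon)
print("Stop codes:", stop_codons)
print("Example protein:", translate(transcribe("ATGTGGTTTTAA")))
```

Run as written, the count shows that only M and W are defined by a single triplet, that the Stop codes are UAA, UAG and UGA, and that the example sequence reads as M, W, F before reaching a Stop code.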

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License.

Phil Iannaccone is a Professor of Pediatrics and Pathology at Northwestern University Feinberg School of Medicine.
