More on RNA and DNA

Previously, we formed a basic outline of what genes, RNA, and DNA are. Now we’ll take a closer look at how these impact our development and how their malfunction can lead to disease.

The DNA in all of our cells is the same, for the most part. The linear sequence of A, T, C, and G is the same from cell to cell, but chemical “marks” on the nucleotides vary between cells, so in that sense the DNA differs. This forms the basis of an amazing story of complexity and faithfulness that is emerging right now and will have profound consequences for our understanding of both normal and disease biology. These marks form what is now called the epigenome, and their study, which we will explore later, is called epigenomics.

The development of asymmetry leading to a body plan involves the coordinated expression of a large number of genes, many of which are highly conserved across species. Those genes are turned on (or off) together by transcription factors, and the branches in the paths described above are chosen in large measure by a choice between two transcription factors that behave like a binary switch. To understand this, we need to explore gene regulation as it is understood today.

DNA packaging in the cell. In the diagram of the cells, the cytoplasm is pink and the nucleus is blue. A dividing cell has replicated all of its DNA to make four copies of all the genes. The DNA double helix (bottom of the figure) is wound around little beads of proteins called histones, which limit access of transcription factors to the gene. The entire structure is densely coiled. The cell must package about two meters of DNA into a space about 10⁻⁵ meters (10 micrometers) across, and replication must faithfully copy three billion base pairs in a matter of hours. Remarkable!

Each of our cells contains about two meters of DNA coiled up into the small space of the nucleus (Figure 5). The DNA contains about three billion nucleotides, each consisting of a sugar (deoxyribose), a phosphate group, and a base (A, T, C, or G; the letters of our genetic code). Since all cells have essentially the same DNA, what is it that allows the cells to be different from each other, to differentiate? The answer is that not all of the 25,000 or so genes embedded in those three billion nucleotides are expressed in a given cell at a given time. The genes that are expressed define what the cell does and what it becomes.
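The packaging numbers can be checked with a little arithmetic. The sketch below is only a back-of-the-envelope estimate. It assumes a spacing of about 0.34 nanometers between neighboring base pairs, two copies of the genome per cell (one from each parent), and a nucleus roughly 10 micrometers across. None of these assumed values come from the text above, and the constant and function names are hypothetical.

```python
# Back-of-the-envelope DNA packaging arithmetic.
# Only the base-pair count comes from the article; the rest are assumptions.
BASE_PAIRS_PER_GENOME_COPY = 3_000_000_000  # from the text
RISE_PER_BASE_PAIR_M = 0.34e-9              # assumed spacing between base pairs
GENOME_COPIES_PER_CELL = 2                  # assumed: one copy from each parent
NUCLEUS_DIAMETER_M = 10e-6                  # assumed typical nucleus size


def dna_length_m(base_pairs: int, copies: int = 1) -> float:
    """Stretched-out length, in meters, of `copies` genomes of `base_pairs` each."""
    return base_pairs * RISE_PER_BASE_PAIR_M * copies


total_length_m = dna_length_m(BASE_PAIRS_PER_GENOME_COPY, GENOME_COPIES_PER_CELL)
length_to_diameter = total_length_m / NUCLEUS_DIAMETER_M

print(f"DNA per cell: about {total_length_m:.1f} m")
print(f"Length relative to nucleus diameter: about {length_to_diameter:,.0f} to 1")
```

Under these assumptions, three billion base pairs stretch to roughly one meter, so the two-meter figure corresponds to both copies of the genome together, and the DNA is a couple of hundred thousand times longer than the nucleus is wide.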

The expression of genes is regulated by a complex set of machinery. The most important components are transcription factors, which are said to act in trans. That is, they are proteins coded for by genes and made in the cytoplasm of the cell; they move to the nucleus, find a target gene, and interact with it, usually by binding to it. These proteins then recruit other proteins to the gene to form a complex, which ultimately binds the enzyme RNA polymerase II to complete the transcription machine. This machine operates as a little molecular switch that turns the gene on; that is, it causes the gene to be transcribed. The RNA polymerase unwinds the double helix and, using one of the two strands as a template, begins to build an RNA molecule with a sequence complementary to the DNA. Where the DNA has a cytosine (C), a guanine (G) is added; where it has a thymine (T), an adenine (A) is added; and so on according to the base-pairing rule. The one exception is that where the DNA has an adenine, the RNA receives a uracil (U; the RNA counterpart of thymine).
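The base-pairing rule is simple enough to write down as a lookup table. The following is a minimal sketch, not a model of the real enzyme: it ignores strand direction and everything about how the polymerase is recruited, and the function name `transcribe` and the example sequence are made up for illustration.

```python
# Toy model of transcription: build the RNA complementary to a DNA template.
# The pairing table follows the base-pairing rule, with U in place of T in RNA.
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe(template_strand: str) -> str:
    """Return the RNA complementary to a DNA template strand, base by base."""
    rna = []
    for base in template_strand.upper():
        if base not in DNA_TO_RNA:
            raise ValueError(f"not a DNA base: {base!r}")
        rna.append(DNA_TO_RNA[base])
    return "".join(rna)


print(transcribe("TACGGCATT"))  # AUGCCGUAA
```

Every T in the template becomes an A in the RNA, while every A in the template becomes a U rather than a T.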

This RNA molecule contains sequence complementary to the entire sequence of the gene and is called pre-mRNA. The gene also contains sequences, though, that do not code for amino acids during protein synthesis, because they are removed from the RNA before it leaves the nucleus. These are called introns, and special enzymes process or “splice” them out of the pre-mRNA, at which point it is called messenger RNA (mRNA). mRNA is exported from the nucleus to the cytoplasm, where it attaches to ribosomes and its sequence (A, C, G, and U) is used to build the protein that the gene (DNA) codes for, one amino acid at a time. The code is read in triplets: each amino acid is coded for by a set of three bases. There are also three-letter codes that start and stop protein synthesis. The sequences that are spliced together from the gene are called “exons,” a term derived from “expressed region”; they are the part that is exported from the nucleus as mRNA.
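Splicing and the triplet code can be sketched in a few lines as well. This is a toy illustration under stated assumptions: the sequence and intron position are invented, the codon table is only a small subset of the standard genetic code, and the function names `splice` and `translate` are hypothetical.

```python
from __future__ import annotations

# Toy model of splicing and translation.
# CODON_TABLE is a small subset of the standard genetic code.
CODON_TABLE = {
    "AUG": "Met",  # also the usual start signal
    "GCU": "Ala", "GCC": "Ala",
    "AAA": "Lys", "AAG": "Lys",
    "UGG": "Trp",
    "UUU": "Phe", "UUC": "Phe",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}


def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Cut out introns, given as (start, end) positions with end exclusive, and join the exons."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])
        position = end
    exons.append(pre_mrna[position:])
    return "".join(exons)


def translate(mrna: str) -> list[str]:
    """Read three bases at a time from the first AUG until a stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return []
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "???")  # ??? = not in this subset
        if amino_acid == "Stop":
            break
        protein.append(amino_acid)
    return protein


# Made-up pre-mRNA: exon, then a 10-base intron, then a second exon.
pre_mrna = "AUGGCU" + "GUAAGUCCAG" + "AAAUGGUAA"
mrna = splice(pre_mrna, [(6, 16)])
print(mrna)             # AUGGCUAAAUGGUAA
print(translate(mrna))  # ['Met', 'Ala', 'Lys', 'Trp']
```

Once the intron is removed, the two exons join into a single reading frame, and the ribosome's job reduces to reading three letters at a time until it reaches a stop signal.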

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License.

Originally published at



Phil Iannaccone is a Professor of Pediatrics and Pathology at Northwestern University Feinberg School of Medicine.
