Tuesday, January 06, 2009

Evolution: genes as base sequences (DNA part 3)

Introductory narratives on genetics - and evolution and paleontology, for that matter - are often quite perilous. For every five sources that discuss the same concept, two can be incomplete, two can be actually misleading (frequently enough due to sloppy language), and only one is lucid enough to cover the concept well. We didn't evolve from monkeys (as we know them), birds didn't evolve from dinosaurs (not the ones we know), and there isn't a 'gene for breast cancer' (as such; what has been identified is a genetic mutation that can increase the risk of breast cancer).

My previous posts on DNA and genes have been somewhat meandering, on the whole. In DNA basics, I discussed DNA as the physical mechanism for storing genetic information. In part two, I discussed different types of DNA, and how genetic material can be shared through means other than species reproduction. I also discussed DNA exchange in both bacteria and viruses (here and here).

DNA is the physical medium, and a fairly robust one it is, despite mutation. Its purpose is to store genetic information, express it, and reproduce it. This post is about expression: the actual use made of the stored information.

The DNA strand is made up of combinations of four different bases, normally known as A, C, G, and T. Three consecutive bases (called a codon) together specify one amino acid. There are 20 standard amino acids that codons can specify, but with 64 possible combinations of three bases (4 x 4 x 4), several different codons can specify the same amino acid (and three codons act as 'stop' signals rather than specifying an amino acid at all).
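The arithmetic here can be checked with a short Python sketch. It is only an illustration: the variable names are mine, amino acids are written with their conventional single-letter codes, and the table is the standard genetic code laid out in T, C, A, G order (some organisms and organelles use slight variants).

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon, with codons enumerated in
# TCAG order (TTT, TTC, TTA, TTG, TCT, ...). '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map every three-base combination to the amino acid (or stop) it specifies.
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# How many codons specify each amino acid (or stop).
codons_per_product = Counter(CODON_TABLE.values())
stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
amino_acid_count = len(set(CODON_TABLE.values()) - {"*"})

print(len(CODON_TABLE), "codons in total")           # 4 x 4 x 4
print(amino_acid_count, "distinct amino acids")
print("stop codons:", stop_codons)
print("codons for leucine (L):", codons_per_product["L"])
print("codons for methionine (M):", codons_per_product["M"])
```

Leucine is specified by six different codons while methionine has only one, which is the redundancy described above.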

An amino acid schematic (alanine)

These amino acids are typically combinations of about 10 to 30 carbon, oxygen, hydrogen and nitrogen atoms (two of them also contain sulfur). Further chemical processes within a cell join amino acids (in a process called translation, which is also directed by DNA-originated instructions) into chains called polypeptides, which are the building blocks of proteins. Proteins are thus an end product of the expression of a consecutive set of bases within a DNA strand; those proteins then play an essential part in metabolic processes.

Thus a gene can be said to be "a sequence of nucleotide (base) pairs along a DNA molecule which codes for a polypeptide product [protein]" - this includes both the sequences that code for the protein(s), and those that govern when they are expressed. However, calling it "the basic unit of heredity in a living organism" (as Wikipedia does) is vague enough to allow for multiple interpretations. I believe it is sufficient to refer to a sequence whose coding results in protein(s) that then take part in other reactions. If the sequence is mutated, the resulting protein may be slightly different, or there may be no viable protein at all. A 'gene for breast cancer' may refer to a sequence that, if mutated in certain ways, yields end products that could lead to cancer. Usually, however, there are multiple pathways to a cancer, and such loose talk really refers to the discovery that mutation of a particular sequence has a statistically significant bearing on the risk of that cancer.
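To make the effect of mutation concrete, here is a rough Python sketch that reads a short sequence three bases at a time and compares the result before and after single-base changes. The twelve-base sequence and the function names are invented for illustration, and the sketch reads the DNA directly, skipping the intermediate RNA copy that a real cell makes first.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Read codons left to right, stopping at a stop codon or the end."""
    chain = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        chain.append(amino_acid)
    return "".join(chain)


def substitute(dna, position, base):
    """Return a copy of the sequence with one base replaced (0-indexed)."""
    return dna[:position] + base + dna[position + 1:]


# Hypothetical sequence, not taken from any real gene.
original = "ATGGCTTGGAAA"

silent = substitute(original, 5, "C")     # GCT -> GCC, same amino acid
changed = substitute(original, 4, "A")    # GCT -> GAT, different amino acid
truncated = substitute(original, 8, "A")  # TGG -> TGA, premature stop

for label, dna in [("original", original), ("silent", silent),
                   ("changed", changed), ("truncated", truncated)]:
    print(label, dna, translate(dna))
```

The first substitution leaves the chain untouched, the second swaps one amino acid for another, and the third cuts the chain short: respectively no effect, a slightly different outcome, and quite possibly no viable outcome.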
