Sunday, August 03, 2008

DNA coding - message in a bottle?

"DNA's performance as an archival medium is spectacular. In its capacity to preserve a message, it far outdoes tablets of stone. Cows and pea plants... have an almost identical gene called the histone H4 gene. The DNA text is 306 characters long... cows and peas differ from each other in only two characters out of these 306... fossil evidence suggests [their common ancestor] was somewhere between 1,000 and 2,000 million years ago... Letters carved on gravestones become unreadable in mere hundreds of years"
- Richard Dawkins, The Blind Watchmaker (p123).

The implications, at first glance, are quite spectacular too. For example, why not preserve meaningful information in such a fashion? We send out signals seeking contact with extra-solar civilisations; why not inscribe similar messages in DNA to likewise reach across the distance of time? The human genome, for one, is roughly three billion base pairs long: plenty of space for encoding information that can be read as clear messages.
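To make "plenty of space" concrete, here is a minimal sketch of how text could be written in the four-letter DNA alphabet. It assumes a naive mapping of two bits per base (A=00, C=01, G=10, T=11) and takes the human genome as roughly three billion base pairs; the mapping and the function names are my own illustration, not an established encoding scheme, and the sketch ignores every biological constraint.

```python
BASES = "ACGT"  # assumed mapping: A=00, C=01, G=10, T=11


def encode(text):
    """Turn text into a DNA string, four bases per UTF-8 byte."""
    data = text.encode("utf-8")
    return "".join(
        BASES[(byte >> shift) & 0b11]  # take two bits at a time, high bits first
        for byte in data
        for shift in (6, 4, 2, 0)
    )


def decode(dna):
    """Reverse encode(): read four bases back into each byte."""
    values = [BASES.index(base) for base in dna]
    out = bytearray()
    for i in range(0, len(values), 4):
        a, b, c, d = values[i:i + 4]
        out.append((a << 6) | (b << 4) | (c << 2) | d)
    return out.decode("utf-8")


def capacity_bytes(base_pairs):
    """Raw capacity at two bits per base, with no redundancy at all."""
    return base_pairs * 2 // 8


message = "Hi"
dna = encode(message)
print(dna)                            # the message as a string of bases
print(decode(dna) == message)         # round trip
print(capacity_bytes(3_000_000_000))  # bytes in roughly 3 billion base pairs
```

On these assumptions a genome-sized text holds about 750 million bytes, which is what makes the idea tempting in the first place.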

There are in fact several reasons why this is somewhat impractical.

First, we are already about halfway through the effective life-supporting span of the solar system. If, for example, we were to take the current human-induced extinction event (global warming and the destruction of biodiversity) to its extreme, we might leave few species behind, and humans would not ipso facto be the most robust of them. If we were to push the destruction back to the bacterial level, life forms complex enough to find and read such messages could well evolve again, but the timing would be tight. The gap between “Oh, someone's encoded a message for us in DNA” and the sun expanding to render the planet uninhabitable could be so small that contingency might not allow for that rediscovery. A single event on the scale of the K-T meteorite impact could play havoc with such timing.

Second comes the inevitable problem of trying to encode two different, potentially conflicting, meanings in the same place. (This is why database designers tend to create primary keys that are independent of specific data fields.) On the one hand, it would be tricky to code a section of DNA to be meaningful both genetically and as a message, and there is no guarantee that such genes would not be subject to evolutionary changes that obliterate the message.

On the other hand, large sections of the genome are seen as “junk DNA”, that is, likely to serve no purpose directly relevant to an organism's makeup. (Which is not to say junk DNA is entirely useless: for any organism to carry excess baggage, there is a cost. We just don't know for sure the purpose and origin of junk DNA. It seems to consist of duplicates and misprints of DNA present elsewhere, rather like a computer's waste bin that hasn't been cleared.)

However, junk DNA looks to be more prone to accumulating mutations than purposive genetic material. Why? If mutation is steady and equally likely throughout the genome (say, for instance, that solar radiation causes a slow but steady rate of damage, a small percentage of miscoding, in haploid genetic material), DNA that has a purpose is more subject to error-correction, via the reduced viability of mutated, i.e. DNA-damaged, individuals. Thus mutations (coding errors) in junk DNA are more readily retained at the individual level, and the information held in that junk code would change more frequently. It is an ideal vacant repository for information, but not a secure one.
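The argument can be illustrated with a toy simulation. This is only a sketch under deliberately crude assumptions of my own: every individual carries the same text twice, once in a "functional" region and once in a "junk" region; both regions mutate at the same per-base rate; any change at all to the functional copy is lethal; and survivors are sampled at random to refill the population. The population size, rate and sequence length are arbitrary illustrative values, not biological estimates.

```python
import random

BASES = "ACGT"


def mutate(seq, rate, rng):
    """Copy seq, replacing each base with a different one with probability rate."""
    out = []
    for base in seq:
        if rng.random() < rate:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def simulate(generations=200, pop_size=50, length=60, rate=0.01, seed=1):
    rng = random.Random(seed)
    original = "".join(rng.choice(BASES) for _ in range(length))
    # Each individual is a pair: (functional copy, junk copy) of the same text.
    population = [(original, original)] * pop_size
    for _ in range(generations):
        offspring = [(mutate(f, rate, rng), mutate(j, rate, rng))
                     for f, j in population]
        # Extreme purifying selection: a damaged functional copy is not viable.
        survivors = [ind for ind in offspring if ind[0] == original]
        if not survivors:
            survivors = population  # nobody viable this round: parents persist
        population = [rng.choice(survivors) for _ in range(pop_size)]
    functional_errors = sum(hamming(f, original) for f, _ in population) / pop_size
    junk_errors = sum(hamming(j, original) for _, j in population) / pop_size
    return functional_errors, junk_errors


functional_errors, junk_errors = simulate()
print("mean errors in functional copy:", functional_errors)
print("mean errors in junk copy:", junk_errors)
```

Because selection only ever looks at the functional copy, that copy stays intact while the junk copy drifts away from the original text, even though both are hit by mutation at exactly the same rate.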

So a genetic designer could conceivably store non-core information in DNA, but couldn't reliably expect it to last through an evolutionary time scale. However, I can picture the technology being developed to enable insertion of signature or copyright information in junk DNA that would last the required human timeframe.


Dawkins, R (1986): The Blind Watchmaker. Penguin, London.
