17 October 2008...10:52 pm

Numerology and DNA…


This is a really short post about a random thought I had while discussing genetics with a friend over dinner. Here’s the idea: DNA has 4 nucleotides, which pair up as AT and CG. Since there are 4 possible nucleotides, let’s use arithmetic modulo 4 (yep, \mathbb{Z}_{4}).

Randomly pick one of the nucleotides, e.g. A, and assign it some element of {0,1,2,3}, say 3. We then assign the weight 1 to T, so that the sum is A+T=3+1=0 mod 4.

That leaves C and G with the weights 0 and 2 (it doesn’t matter which gets which). Observe that their sum is C+G=0+2=2 mod 4.

Now DNA is a sequence of such pairs, so replacing each pair by its sum reduces it to a sequence of zeros and twos. If we normalize these values (divide each by 2), we get a sequence of ones and zeros. Holy cow, this looks like a binary string!
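Here is a minimal sketch of that reduction in Python, using the weights chosen above (A=3, T=1, C=0, G=2). The function names and the sample strand "GATTACA" are made up for illustration; nothing here is a standard bioinformatics API.

```python
# Weights from the post: A=3, T=1 (so A+T = 0 mod 4), C=0, G=2 (so C+G = 2 mod 4).
WEIGHT = {"A": 3, "T": 1, "C": 0, "G": 2}
# Each base pairs with exactly one partner on the opposite strand.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def pair_sums(strand):
    """For each base, add its weight to its partner's weight, mod 4."""
    return [(WEIGHT[b] + WEIGHT[COMPLEMENT[b]]) % 4 for b in strand.upper()]


def to_bits(strand):
    """Normalize the pair sums from {0, 2} to {0, 1} by halving."""
    return "".join(str(s // 2) for s in pair_sums(strand))


# "GATTACA" is just a hypothetical sample strand.
print(pair_sums("GATTACA"))  # [2, 0, 0, 0, 0, 2, 0]
print(to_bits("GATTACA"))    # 1000010
```

Note that with this choice of weights the bit only records whether a position holds an AT pair (0) or a CG pair (1); which base sits on which strand is lost.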

Of course, DNA is such a huge source of data, you’ll probably end up with stuff like lines from Shakespeare translated to ancient Greek backwards, or some other phenomenal coincidence like that.

Just a random thought that I thought was interesting…what is also interesting is that the square of each pair’s sum is zero mod 4. That is:

(A+T)^2 = 0^2 = 0 mod 4

(C+G)^2 = 2^2 = 0 mod 4

Remember that the square of any integer is 0 or 1 mod 4. Why? Well, suppose we have an even number 2n; when we square it we get 4n^2, which is 0 mod 4. If we have an odd number 2n+1, then we see (2n+1)^2 = 4n^2 + 4n + 1 = 1 mod 4.
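A quick brute-force sanity check of this in Python (a sketch only; the helper name `square_mod4` is made up here, and a finite range is of course no proof):

```python
def square_mod4(n):
    """Return n^2 reduced mod 4."""
    return (n * n) % 4


# Even numbers square to 0 mod 4, odd numbers square to 1 mod 4.
for n in range(-50, 51):
    expected = 0 if n % 2 == 0 else 1
    assert square_mod4(n) == expected

# The two pair sums from the post, 0 (A+T) and 2 (C+G), are both even,
# so both squares land on 0 mod 4.
pair_sum = {"A+T": 0, "C+G": 2}
print({name: square_mod4(s) for name, s in pair_sum.items()})
# {'A+T': 0, 'C+G': 0}
```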

What does this mean? It means that this numerological DNA thingy involves squares of even numbers!

Perhaps if we treat it as a sort of code, there is an error-correcting mechanism built into it? Perhaps if it’s a Reed-Muller code, we can bust out some Galois theory and do some finite group analysis. Or if we treat it as a lattice, sphere packings could get involved. Or we could treat it as a sort of binary continued fraction? There’s quite a bit one can do from here!

Well, I’m exhausted, but I thought this random coincidence was worthy of note…

Post Script: Another thought just occurred to me. Suppose we treat every block of, e.g., 16 digits (more generally N digits) as a sort of “floating point” data type, that is, we have

d_0 2^0 + d_1 2^{-1} + d_2 2^{-2} + \ldots + d_{N-1}2^{1-N}

where d_0,\ldots,d_{N-1} are N consecutive ones and zeros from the genetic code. What do we have? Well, floating point numbers form a nondistributive, commutative pseudo-Algebra (I say “pseudo-Algebra” because my intuition is that it is an algebra, but I have no proof of this). Coincidentally, the Griess algebra is a commutative, nonassociative 196884-dimensional algebra…perhaps one can contrive some sort of connection between DNA and the Griess algebra?
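To make the sum concrete, here is a small Python sketch that evaluates d_0 2^0 + d_1 2^{-1} + … + d_{N-1} 2^{1-N} for a string of binary digits and chops a longer string into N-digit words. The names `bits_to_value` and `chunk_values` are invented for this example, and I use exact fractions so that no rounding sneaks in.

```python
from fractions import Fraction


def bits_to_value(bits):
    """Evaluate d_0*2^0 + d_1*2^-1 + ... + d_{N-1}*2^(1-N) exactly."""
    return sum(Fraction(int(d), 2 ** k) for k, d in enumerate(bits))


def chunk_values(bitstring, n=16):
    """Split into consecutive n-digit words (dropping any short tail)
    and evaluate each word with bits_to_value."""
    return [
        bits_to_value(bitstring[i:i + n])
        for i in range(0, len(bitstring) - n + 1, n)
    ]


print(bits_to_value("101"))            # 5/4, i.e. 1 + 0/2 + 1/4
print(chunk_values("1000010", n=3))    # [Fraction(1, 1), Fraction(1, 4)]
```

Every value produced this way lies in the interval [0, 2), since d_0 contributes at most 1 and the remaining digits contribute less than 1 in total.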

Remark: Next time I’ll ramble on about the floating point arithmetic as a formal algebra, or whatever it is. I’ll try to provide some pseudo-rigorous proofs…
