Chemiotics: A chemical double entendre

Posted on behalf of Retread

A few chemists who were both literary and literal recently looked fairly silly on the pages of the New York Times and in the blogosphere. You can read all about it on Michelle Francl’s blog “Culture of Chemistry” (https://cultureofchemistry.blogspot.com/). See her post of 1 May ’08, “How to tell if you’re really a chemist” (https://cultureofchemistry.blogspot.com/2008/05/you-pronounce-unionized-as-un-ionized.html). To make a long story short, 3 chemically impossible organic molecules (5 bonds to carbon, etc.) spelled out the word SEX in a review of a book about (what else?) sex. The chemists missed the semantic forest while closely inspecting the chemical trees.

Is there anything inside the cell being read chemically two different ways? Yes there is, and it has implications for how we determine what in the genome is being worked on by natural selection and what is being left alone. If intron, exon, neutral selection and synonymous and nonsynonymous codons aren’t old friends, have a look at the first comment which will give you all the background you need (which is quite a bit).

People attempt to measure the rate of natural selection acting on proteins by comparing synonymous and nonsynonymous codon changes in the same protein in different organisms (hemoglobin, for example). Selection is measured as the rate of nonsynonymous nucleotide substitution per nonsynonymous site (Ka), relative to the underlying ‘neutral mutation rate’, which is given by the rate of synonymous substitution per synonymous site (Ks). Usually Ka is much less than Ks, as most new mutations aren’t helpful or are actually harmful (this is negative selection). Positive selection is implied by Ka/Ks greater than 1. Note, however, that strictly by chance nonsynonymous nucleotide substitutions outnumber synonymous ones by about 2:1, which is why both rates are expressed per site.
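For anyone who wants to check the chance expectation, here is a minimal sketch in Python. It assumes the standard genetic code and weights every possible one-nucleotide change equally, which real mutation does not do, so treat it as a back-of-the-envelope tally rather than the way Ka and Ks are estimated in practice. The function names and the numbers passed to `ka_ks` are made up for illustration.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def count_single_nucleotide_changes(table=CODON_TABLE):
    """Classify every one-nucleotide change to every sense codon, all weighted equally."""
    counts = {"synonymous": 0, "nonsynonymous": 0, "nonsense": 0}
    for codon, amino_acid in table.items():
        if amino_acid == "*":
            continue  # only start from codons that code for an amino acid
        for position in range(3):
            for base in BASES:
                if base == codon[position]:
                    continue
                mutant = codon[:position] + base + codon[position + 1:]
                new_amino_acid = table[mutant]
                if new_amino_acid == amino_acid:
                    counts["synonymous"] += 1
                elif new_amino_acid == "*":
                    counts["nonsense"] += 1  # change creates a stop codon
                else:
                    counts["nonsynonymous"] += 1  # change swaps the amino acid
    return counts


def ka_ks(nonsyn_changes, nonsyn_sites, syn_changes, syn_sites):
    """Ka/Ks as defined in the text: per-site nonsynonymous rate over per-site synonymous rate."""
    return (nonsyn_changes / nonsyn_sites) / (syn_changes / syn_sites)


counts = count_single_nucleotide_changes()
print(counts)
print(counts["nonsynonymous"] / counts["synonymous"])

# Hypothetical counts, purely for illustration: 10 of 300 nonsynonymous sites changed,
# 20 of 100 synonymous sites changed.
print(ka_ks(10, 300, 20, 100))
```

With every change weighted equally, the tally is 134 synonymous changes against 392 that swap the amino acid (plus 23 that create a stop codon), which is nearer 3:1. The exact figure depends on assumptions such as how heavily transitions are weighted relative to transversions, so a ratio quoted elsewhere need not match this unweighted count.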

All very nice, but ESSs and ESEs are found in exons, and mutations in them will change alternate splicing (something a functioning cell has a great interest in). It’s easy to see how changing one nucleotide in an ESS or an ESE could render it more or less effective, while leaving the amino acid sequence of the underlying protein unchanged. In short, the ‘neutral mutation rate’ may in fact not be neutral at all (if the mutation is in an ESE or an ESS). Or possibly switching one amino acid for another has nothing whatever to do with the protein and everything to do with controlling alternate splicing.
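To make the double reading concrete, here is a small Python sketch under loud assumptions: the two 12-nucleotide sequences are invented, and the hexamer GAAGAA is only a placeholder standing in for an ESE, not a motif taken from any particular gene. The point is just that one nucleotide change can leave the translated protein untouched while destroying a sequence the splicing machinery might be reading.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Placeholder enhancer motif, assumed for illustration only.
HYPOTHETICAL_ESE = "GAAGAA"


def translate(dna):
    """Translate a DNA coding sequence, three nucleotides at a time."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def has_motif(dna, motif=HYPOTHETICAL_ESE):
    """Report whether the (hypothetical) splicing motif occurs anywhere in the sequence."""
    return motif in dna


original = "ATGGAAGAACTG"  # codons ATG GAA GAA CTG
mutated = "ATGGAGGAACTG"   # codons ATG GAG GAA CTG (one A changed to G)

for name, sequence in (("original", original), ("mutated", mutated)):
    print(name, translate(sequence), has_motif(sequence))
```

Both sequences translate to the same four amino acids (GAA and GAG both code for glutamate), yet only the original still contains the placeholder motif. A Ka/Ks calculation would score this change as synonymous and therefore neutral.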

Now, chemists are adept at doing all sorts of different things with the same structure. Think what organic chemists can do with a carbonyl group. But whatever they do is over and done with. In protein-coding genes, the same sequence can mean two different things without being chemically changed at all.

We are far from understanding all the things DNA can do in a cell. Less than 2% of our 3.2 gigabases of DNA codes for protein. Calling the 98% of the genome not doing so ‘junk’ is a vestige of the protein-centric era of molecular biology, just as calling the change of one synonymous codon for another ‘neutral’ is. Both assume that the only thing that DNA does is code for protein.

The expressive power of language lies in its ambiguity not its precision. DNA may be similar as we uncover the languages it speaks. My guess is that there are more to be found.

One thought on “Chemiotics: A chemical double entendre”

  1. Each of the amino acids found in proteins is one of 20 possibilities, and each position of DNA (a nucleotide) is one of four possibilities, so 2 consecutive nucleotides aren’t enough (16 possibilities) while 3 are too many (64 possibilities, 44 too many in fact). Each of the 64 combinations of 4 nucleotides taken 3 at a time is called a codon. 3 of the 64 codons don’t code for an amino acid at all; they are (inappropriately) called nonsense codons, and tell the cellular machinery to stop adding amino acids to the chain. The remaining 61 codons cover 20 amino acids, so 41 extra codons is a lot of redundancy: some amino acids (leucine) have 6 distinct codons, and these count as synonymous codons. Some amino acids have just one codon (methionine). Each choice of 3 nucleotides (a codon) codes for one and only one amino acid. A pair of codons is therefore either synonymous (same amino acid) or nonsynonymous (different amino acids).

    So changing one nucleotide for another in a codon may lead to a change in the amino acid it was coding for or it may not. If it doesn’t, the thinking was until a few years ago that natural selection shouldn’t care as the output (the amino acid sequence of the protein) won’t be changed.

    Our genes occur in pieces. Dystrophin is the protein mutated in the commonest form of muscular dystrophy. The gene for it is 2,220,233 nucleotides long but dystrophin contains ‘only’ 3685 amino acids, not the 740,000 or so amino acids the gene could specify at 3 nucleotides apiece. What happens? The whole gene is transcribed into an RNA of this enormous length, then 78 distinct segments of RNA (called introns) are removed by a gigantic multimegadalton machine called the spliceosome, and the 79 segments actually coding for amino acids (these are the exons) are linked together and the RNA sent on its way.

    All this was unknown in the 70s and early 80s when I was running a muscular dystrophy clinic and taking care of these kids. Looking back, it’s miraculous that more of us don’t have muscular dystrophy; there is so much that can go wrong with a gene this size, let alone transcribing and correctly splicing it to produce a functional protein.

    One final complication: alternate splicing. The spliceosome removes introns and splices the exons together. But sometimes exons are skipped, or one of several alternative exons is used at a particular point in a protein. So one gene can make more than one protein. The record holder is something called the Dscam gene in the fruit fly, which can make over 38,000 different proteins by alternate splicing.

    Alternate splicing is not rare. [Proc. Natl. Acad. Sci. vol. 102 pp. 12813–12818 ’05] contains 7 references which variously estimate the fraction of mammalian genes that are alternately spliced at anywhere from 22 to 74%. What controls alternate splicing? Sequences in the gene for the protein itself. These sequences can be in either a given intron or a given exon and they can either enhance splicing or inhibit it. Those in exons are called ESS (for exonic splicing suppressor) or ESE (for exonic splicing enhancer). ISS and ISE have similar meanings, where I stands for intronic.

    Now back to the main course…
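The counting in this comment can be checked with a short Python sketch. It assumes the standard genetic code, and takes the dystrophin gene and protein lengths exactly as quoted above; the variable and function names are mine.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def codon_degeneracy(table=CODON_TABLE):
    """Count how many codons specify each amino acid (stop codons excluded)."""
    return Counter(aa for aa in table.values() if aa != "*")


degeneracy = codon_degeneracy()
stop_codons = sorted(codon for codon, aa in CODON_TABLE.items() if aa == "*")

print(4 ** 2, 4 ** 3)                    # possibilities with 2 and with 3 nucleotides
print(len(degeneracy))                   # number of distinct amino acids
print(stop_codons)                       # the 'nonsense' codons
print(degeneracy["L"], degeneracy["M"])  # codons for leucine and for methionine

# Dystrophin figures as quoted in the comment.
GENE_LENGTH_NT = 2_220_233
PROTEIN_LENGTH_AA = 3685

max_amino_acids = GENE_LENGTH_NT // 3  # if every nucleotide were coding
coding_fraction = PROTEIN_LENGTH_AA * 3 / GENE_LENGTH_NT
print(max_amino_acids)
print(f"{coding_fraction:.2%}")          # share of the gene that ends up as codons
```

Read this way, only about half a percent of the dystrophin gene ends up specifying amino acids; the rest is intron.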
