
April 08, 2009

Chemiotics: Chemists — masters of the Cartesian dualism

Posted on behalf of Retread

People speak of information pretty glibly. Claude Shannon defined it as various combinations of bits (binary digits which can be ones and zeros) for electronics 61 years ago in a paper written about his classified work during World War II. Neuroscientists speak of information processing by the brain as the way it manipulates its input (a series of action potentials in nerve fibers which are about as close to Shannon's ones as you can get).

So that's what information is. But we really don't understand the entities (electronics, the brain) which actually do the processing terribly well. Consider first solid-state electronics, which catches Shannon's ones and zeroes. Just how well do we understand the solid state? Not very well according to Robert Laughlin, Nobel physicist, in his book "A Different Universe: Reinventing Physics from the Bottom Down". Quantum mechanics is now introduced to chemists in college, and I assume a course is obligatory in graduate school these days. Laughlin says it doesn't really matter in understanding the solid state, in the same way that the underlying chemical structure of the zillions of organic compounds which have been crystallized does not in any sense matter in explaining the crystalline state. All that matters is that each molecule adopts the same shape, regardless of what that shape is (this is why proteins are hard to crystallize). The book will make your head swim.

How about the brain? Do we understand it? Ask your friendly neighborhood neuroscientist why we need sleep, or better, exactly how and where in the brain memories are stored. You may hear a few mumbles about reverberating circuits or long term potentiation, but we really don't know. Although the brain has 10^10 neurons and probably 10^13 synapses (which is how neurons talk to each other), we can't use statistical mechanics to understand it. Even in the simplest case, the monatomic ideal gas, the atoms are assumed not to interact with each other (other than collide), and their energies are assumed sufficiently low that electronic excitation isn't possible; neurons, by contrast, interact constantly through their synapses. Just as a list of the 10^23 positions and the 10^23 momenta does not explain the pressure of a gas, the list of what the 10^13 synapses are doing every millisecond, in addition to being incomprehensible, would not explain in any sense how and where memories are stored.

Where does chemistry come in? Consider the chemiotics posts of 9 Feb and 20 Jan. They say something profound about information, and not just in the cell. The information in DNA depends on how it's read (one way by the ribosome reading mRNA to make a protein, another by the splicing machinery to determine what mRNA is made, and a third way by microRNAs to determine how long the mRNA hangs around). Only through chemistry can the reader of the information be understood, and I think chemists understand the readers fairly well. I'm not sure if Shannon's concept of information entropy could even be applied to a DNA sequence being read 3 ways at once by different molecular machines. All the discussions of information I've seen pretty much ignore what's actually doing the reading.

Galileo famously said, "The universe cannot be read until we have learnt the language and become familiar with the characters in which it is written. It is written in mathematical language". Well, the information we have the best chance of understanding (because we understand the reader) is written in the language of chemistry. Thus do chemists stand astride the Cartesian dualism of materiality and the nonphysicality of information.

March 17, 2009

Chemiotics: Binding physicality rather than chemicality

Posted on behalf of Retread

Organic chemists love mechanism, subtlety and specificity. Books have been written about pushing arrows. Medicinal chemists are always worrying about making molecules which they can dock into either the active site or an allosteric site of a target protein. The fit must be quite close, and a recent post over at In the Pipeline notes that ‘You’ll have whole series of compounds that have to have a methyl group at some position, or they’re all dead. Nothing smaller, nothing larger, nothing with a different electronic flavor: it’s methyl or death.’

So making an organic molecule that responds to the physical properties of its surroundings – rather than the bonding structure of the molecules surrounding it – stands this sort of work on its head. As usual, nature got there first. Here are two examples.

Cells need to respond to the amount of cholesterol they contain, and make more if lacking. Cholesterol is poorly soluble in water, being found mostly in membranes. Here cholesterol functions as a fluidizer, making the long hydrocarbon chains of phospholipids and other lipids more disordered in order to fit around it. So cholesterol doesn't exist just to make pharmaceutical companies rich. A similar mechanism probably explains why unsaturated fatty acids (such as oleic acid) found in membranes have cis rather than trans double bonds (and in the middle of the chain to boot), making them harder to pack.

So if your membranes have less cholesterol they become stiffer. This stiffness is sensed in some way by several membrane embedded proteins (SCAP, INSIG1). SCAP then moves SREBP, another membrane embedded protein (along with its associated membranes) to another site in the cell where it is cleaved. It took years to figure out how water got inside the hydrophobic environment of the membrane to cleave (hydrolyze a peptide bond) SREBP. One of the SREBP cleavage products is then able to leave the membrane, migrate to the nucleus, bind to DNA and turn on genes in the cholesterol synthesis pathway. Elegant no?

A second example. The DNA in our cells is under constant chemical attack. Ultraviolet light produces cyclobutane dimers of adjacent pyrimidine nucleotides. Nucleotides fall off the backbone or have attached molecular fragments which alter their stereochemistry. Then there are the mismatches (an A or a T pairing with G rather than C, etc.). Somehow, proteins scan DNA for these lesions (and find them). One such protein complex is DDB1/DDB2 (see here and here) which recognizes a very broad range of DNA lesions which are subsequently targeted for repair. DDB1/DDB2 binds to pyrimidine dimers (which distort the helix) and to DNA with crosslinked bases (e.g., due to cisplatin, psoralen), and also to DNA lacking nucleotide bases (just the opposite of crosslinked DNA).

How can one protein complex do all this? One theory has it that DNA lesions are recognized by their increased flexibility (because of decreased stability of base pairing and stacking in damaged DNA). This enables DNA lesion finding protein complexes such as DDB1/DDB2 to target a broad range of DNA pathologies for repair (without recognizing them specifically). They are binding to the effect of chemistry, rather than the chemistry itself, i.e., they are binding to a physical property of damaged DNA rather than its chemical structure.

Only the chemist can fully appreciate the wonder of what's going on under the cellular hood. In this we are fortunate, even if regarded as somewhat grubby by everyone else. Pascal's thinking reed and all that.

February 09, 2009

Chemiotics: The further uses of redundancy

Posted on behalf of Retread

Remember noncoding DNA? For protein that is. That's 98% of our genome. It now appears that at least half of our genome is transcribed into RNA. Is this a case of transcription machinery gone wild? One type of RNA made from the 98% is called microRNA (after it is cut from a larger precursor). MicroRNAs are only 21-23 nucleotides long. They aren't used to make proteins (which would be at most 7 amino acids long anyway). Instead they bind to complementary sequences in messenger RNA by classic Watson-Crick base pairing, and inhibit the translation of the mRNA into protein by the ribosome. So although microRNAs don't code for proteins, they help determine how much of each protein is made.

Until recently, microRNA binding to mRNA was thought to occur at the tail end (which does not code for protein). Two recent papers show that microRNAs also bind to the amino acid coding sequences of some proteins [Nature vol. 455 pp. 1124-1128, 2008 and PNAS vol. 105 pp. 20297-20302, 2008]. Change one synonymous codon to another, and the microRNA no longer binds and the level of the protein changes. So this is the third code written into our DNA.
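A toy sketch of the point, in Python. Everything in it is invented for illustration: the 27-nucleotide mRNA, the 21-nucleotide microRNA built to be its perfect complement, and the rule that binding requires perfect Watson-Crick complementarity (real microRNA binding is more forgiving than that). It only shows that swapping one synonymous codon can leave the protein untouched while removing the site.

```python
# Toy illustration: a synonymous codon change that destroys a microRNA site.
# Sequences are hypothetical, not taken from the cited papers.

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))  # standard genetic code, '*' = stop

PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the strand that pairs with `rna`, read 5' to 3'."""
    return "".join(PAIR[base] for base in reversed(rna))


def translate(mrna):
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def has_perfect_site(mrna, mirna):
    """Assumed binding rule: the mRNA must contain the exact complement."""
    return reverse_complement(mirna) in mrna


wild_codons = ["AUG", "GCU", "GAA", "CUG", "AAA", "GGU", "UCU", "GAC", "UAA"]
mutant_codons = list(wild_codons)
mutant_codons[3] = "CUC"  # CUG -> CUC, both code for leucine

wild = "".join(wild_codons)
mutant = "".join(mutant_codons)

# Hypothetical 21-nt microRNA, complementary to codons 2 through 8.
mirna = reverse_complement(wild[3:24])

print(translate(wild), has_perfect_site(wild, mirna))
print(translate(mutant), has_perfect_site(mutant, mirna))
```

The protein is the same in both cases; only the second reader, the microRNA, notices the change.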

What's so remarkable about that? Pop a DVD of a movie into a player. You are given choices of subtitles, language, etc... All these modalities are coded on separate tracks and blended together by the player after you choose. DNA is just one track and is coding for subtitles, sound and pictures by the same sequence of nucleotides. A given DNA sequence is capable of being read at least 3 ways — amino acid, exonic splicing enhancers and inhibitors, and microRNA — (and who's to say that these are the only ways DNA can be read).

The examples in the Nature paper are far from trivial as they involve Nanog, Sox2 and Oct4. So what? These three genes are crucial for stem cell function, and with a fourth have been used to transform normal cells into 'stemlike' cells (induced pluripotent cells — iPSs). What could be sexier than that? MicroRNA-control of these proteins has to be important.

There has recently been a good deal of interest in diversity oriented synthesis of small molecules — see [Nature vol. 457 pp. 153-154, 2009] and the ‘In the Pipeline’ blog post of 20 Jan, along with the more than 40 comments it brought forth. The hope is to create a wider variety of small molecules which can interact with proteins than we've been used to — and which might be useful drugs.


January 20, 2009

Chemiotics: The death of the synonymous codon

Posted on behalf of Retread

For years, stretches of DNA not coding for protein were called noncoding DNA. As we came to know more about DNA, sites coding for just where the transcription of DNA to messenger RNA (mRNA) should begin, along with the DNA coding for the RNA in ribosomes were grandfathered in. Then about 30 years ago, we found that most genes coding for proteins contained large stretches of DNA not coding for amino acids at all.

Dystrophin, the protein whose gene is defective in Duchenne muscular dystrophy, contains 3685 amino acids, but the gene stretches over 2.2 million contiguous positions in DNA. It only takes 11,055 positions to code 3685 amino acids. However, the 11,055 occur in 79 stretches (called exons), separated by 78 much larger stretches of DNA (called introns). The whole 2.2 megabases is transcribed into mRNA and then the introns are lopped (spliced) out by the spliceosome, a gigantic protein and RNA machine even larger and more complicated than the ribosome (300 proteins, 5 RNAs [see: Science vol. 307 pp. 863-864, 2005]).
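The arithmetic is worth a quick check. The sketch below uses only the figures quoted above; the averages at the end are my own back-of-envelope additions and assume, for simplicity, that everything in the gene that isn't coding exon is intron (untranslated regions and the stop codon are ignored).

```python
# Back-of-envelope check of the dystrophin numbers quoted in the post.
AMINO_ACIDS = 3685
GENE_LENGTH = 2_200_000  # positions in DNA
EXONS = 79

coding_positions = AMINO_ACIDS * 3            # three nucleotides per amino acid
introns = EXONS - 1                           # one intron between each pair of exons
coding_fraction = coding_positions / GENE_LENGTH

# Rough averages, assuming all non-coding positions are intronic.
mean_exon = coding_positions / EXONS
mean_intron = (GENE_LENGTH - coding_positions) / introns

print(f"coding positions: {coding_positions}")
print(f"introns: {introns}")
print(f"coding fraction of the gene: {coding_fraction:.2%}")
print(f"mean exon ~{mean_exon:.0f} nt, mean intron ~{mean_intron:.0f} nt")
```

On these assumptions only about half a percent of the gene codes for amino acids, and the average intron is some two hundred times longer than the average exon.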

Ever since the human genome project ended, people have wondered why we have so few protein coding genes (around 20,000 at last count). The humble E. coli contains 4300 [see: Nature vol. 385 p. 472, 1997]. Not to worry, we make lots of different proteins from the same gene, by using different combinations of exons – some exons are skipped by the spliceosome when it removes introns. Different tissues (or different states of the same tissue) skip different exons depending on (as yet obscure) conditions, so lots of different variants of the same protein are made. The process is called alternative splicing and is quite common – it happens in 92-94% of human protein genes according to a recent paper [see: Nature vol. 456 pp. 470-476, 2008 and here].

What determines which exons are left in the final product and which are skipped? This is where it gets really interesting. There exist stretches of DNA called exonic splicing enhancers (ESEs) and other stretches inhibiting the splicing in of a particular exon – the exonic splicing inhibitors (ESIs). Where are the ESEs and ESIs found? In the exons themselves.

So what? And what does this have to do with synonymous codons? The commonest genetic disease of Caucasians is cystic fibrosis (CF). Using the 12th exon of CFTR (the gene mutated in CF), when one synonymous codon was switched to another, 25% of the time it resulted in skipping of exon 12 and a defective protein [see: Proc. Natl Acad. Sci. USA vol. 102 pp. 6368-6372, 2005]. So synonymous codons aren't synonymous at all. A completely different cellular use of synonymous codons will follow in the next post, but why should chemists be interested in any of this?

Because DNA isn't sitting there passively waiting to be read in just one way. All sorts of new chemistry is involved. There is not enough space in this post for the next two examples, but their chemistry does not involve protein-DNA interaction.

So even if we had 15 amino acids and a stop codon to begin with (as per the last post) we could never give up that extra position and all that redundancy now. We need the coding overkill because it is being used for other things. This work also has profound implications for our understanding of protein evolution. That's also for next time.

December 16, 2008

They'd none of them be missed – why 20 amino acids and not 15?

Posted on behalf of Retread

Making DNA is metabolically expensive. 4 ATPs are consumed making adenine (and that's even when you start with 5-phosphoribosyl-alpha-pyrophosphate – PRPP). This is why parasites living inside cells have such small genomes. As soon as they figure out a way to get the host to do their metabolic work, they jettison the (now redundant) DNA. The leprosy organism which lives inside cells sheathing nerve processes has only two-thirds the DNA of its cousin, the tuberculosis organism. There are many similar examples and not all are bacterial.

As you know, 'the' genetic code is made of nucleotides which come in four varieties (abbreviated A, T, G, C). There are 16 possible combinations when nucleotides are taken 2 at a time, 64 combinations taken 3 at a time. 64 combinations is clearly overkill for just 20 amino acids. So most amino acids have multiple combinations of 3 nucleotides (called codons) which code for them – these are the synonymous codons. Three amino acids (leucine, arginine, serine) have 6 synonymous codons, 2 have none (i.e., just one codon – methionine and tryptophan), the rest fall in between.
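For anyone who wants to tally the redundancy, here is a short sketch. It assumes the standard genetic code, written in the usual compact TCAG ordering (one-letter amino acid codes, '*' for stop), and simply counts codons per amino acid. Note that on this table serine joins leucine and arginine in the six-codon club.

```python
from collections import Counter

# Standard genetic code in TCAG order; '*' marks the three stop codons.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODE = dict(zip(CODONS, AMINO))

pairs = len(BASES) ** 2      # nucleotides taken 2 at a time
triplets = len(BASES) ** 3   # nucleotides taken 3 at a time

# Number of codons for each amino acid (stops excluded).
degeneracy = Counter(aa for aa in CODE.values() if aa != "*")
stops = sum(1 for aa in CODE.values() if aa == "*")

six_codons = sorted(aa for aa, n in degeneracy.items() if n == 6)
one_codon = sorted(aa for aa, n in degeneracy.items() if n == 1)

print(f"{pairs} doublets, {triplets} triplets, {stops} stop codons")
print("six codons:", six_codons)   # L = leucine, R = arginine, S = serine
print("one codon:", one_codon)     # M = methionine, W = tryptophan
```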

If proteins contained only 15 amino acids, you could cut genome size by one-third – that's 4 billion or so ATPs/cell if the 3 other nucleotides are as expensive to make as adenine. As the late senator Dirksen used to say – a billion here, a billion there, pretty soon you're talking real money (this was in an older, happier pre-bailout time).

Why 15 and not 16 amino acids? Because you need a codon to tell the machinery when to stop – such codons were known as 'nonsense', back in the day when all the genome was thought to do was code for protein.
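The counting behind "15 plus a stop" can be sketched in a few lines. The assumptions are mine and are labeled in the code: a genome of 3.2 gigabases (the figure quoted elsewhere in these posts), 4 ATPs per nucleotide across the board, and the post's own simplification that the whole genome shrinks in proportion to codon length.

```python
# How long must a codon be to encode a given number of meanings?
N_BASES = 4
GENOME_NT = 3.2e9   # assumed genome size, in nucleotides
ATP_PER_NT = 4      # assumed cost, taking every nucleotide to cost what adenine does


def codon_length(n_meanings, n_bases=N_BASES):
    """Smallest codon length whose combinations cover n_meanings."""
    length = 1
    while n_bases ** length < n_meanings:
        length += 1
    return length


current = codon_length(20 + 1)   # 20 amino acids plus a stop signal
reduced = codon_length(15 + 1)   # 15 amino acids plus a stop signal
too_many = codon_length(16 + 1)  # 16 amino acids plus a stop no longer fits in doublets

fraction_saved = 1 - reduced / current
atp_saved = GENOME_NT * fraction_saved * ATP_PER_NT

print(f"codon length now: {current}, with 15 amino acids: {reduced}")
print(f"with 16 amino acids plus a stop: {too_many}")
print(f"fraction of DNA saved: {fraction_saved:.3f}")
print(f"ATP saved per genome copy: {atp_saved:.2e}")
```

Sixteen meanings is exactly what two nucleotides can carry, which is why the cut-off falls at 15 amino acids and not 16.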

Look at the side chains of the 20 amino acids with your chemist's eye. Some are so similar as to be redundant. Glutamic acid and aspartic acid are chemically the same, differing only by a methylene group – get rid of one. Glutamine and asparagine are just the amides of the two acids (why they aren't called glutamide and asparamide is beyond me). Get rid of one of them. Similarly threonine and serine differ only by an extra methyl group. Not only that but the several hundred different enzymes which add phosphate to them (inappropriately called kinases) don't bother to tell them apart – get rid of one. Do we really need 4 different hydrocarbon side chains (methyl, isopropyl, sec-butyl, isobutyl)? Maddeningly sec-butyl belongs to isoleucine, and isobutyl belongs to leucine. Get rid of two of them – probably a long one and a short one. Other chemists might choose different amino acids to let go.

Removing these 5 amino acids from the total cuts the DNA required to code for them down by one-third, saving all that synthetic ATP. Of course, synonymous codons disappear in the process. Nonetheless, we should be able to build pretty decent proteins from the 15 amino acids we have left. No chemical functionality present in the original 20 has been lost.

Clearly this hasn't happened in the real world. Just why not is probably a matter of history, and an endless source of armchair speculation (like this post). Could there be a reason for all this coding redundancy, or at least could there be mechanisms to keep it in place?

I think such mechanisms exist, but you'll have to give up the protein-centric notion that all DNA does is code for protein. Even better, there is excellent recent hard experimental data to back this up. But that's the subject of the next post.

October 30, 2008

Chemiotics: Sherlock Holmes and the Green Fluorescent Protein

Posted on behalf of Retread

Gregory (Scotland Yard): "Is there any other point to which you would wish to draw my attention?"
Holmes: "To the curious incident of the dog in the night-time."
Gregory: "The dog did nothing in the night-time."
Holmes: "That was the curious incident."

The chromophore of green fluorescent protein (GFP) is para-hydroxybenzylidene imidazolinone. It is formed by cyclization of a serine (#65) tyrosine (#66) glycine (#67) sequential tripeptide. It is found in the center of a beta barrel formed by the 238 amino acids of GFP.

What is so curious about this?

Simply put, why don't things like this happen all the time? Perhaps nothing quite this fancy, but on a more plebeian level consider this: of the twenty amino acids, 2 are carboxylic acids, 2 are amides, 1 is an amine, 3 are alcohols and one is a thiol. One might expect esters, amides, thioesters and sulfides to be formed deep inside proteins. Why deep inside? On the surface of the protein, there is water at 55 molar around to hydrolyze them purely by the law of mass action (releasing about 10 kJ/Avogadro's number per bond in the process). Some water is present in the X-ray crystallographic structure of proteins, but nothing this concentrated.
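In case the 55 molar figure looks odd, it is just the density of water divided by its molar mass. A minimal check, assuming a density of about 997 g per liter (roughly room temperature):

```python
# Molar concentration of pure water.
DENSITY_G_PER_L = 997.0    # approximate density of water near 25 C (assumed)
MOLAR_MASS_G_PER_MOL = 18.015

molarity = DENSITY_G_PER_L / MOLAR_MASS_G_PER_MOL
print(f"water is about {molarity:.1f} M")
```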

The presence of 55 M water bathing the protein surface leads to an even more curious incident, namely why proteins exist at all given that amide hydrolysis is exothermic (as well as entropically favorable). Perhaps this is why proteins contain so many alpha helices and beta sheets -- as well as functioning as structural elements they may also serve to hide the amides from water by hydrogen bonding them to each other. Along this line, could this be why the hydrophilic side chains of proteins (arginine, lysine, the acids and the amides) are rather bulky? Perhaps they also function to sterically shield the adjacent amides. After all, why should lysine have 4 methylene groups rather than just one or two?

Now the serine-tyrosine-glycine tripeptide should occur by chance once in every 8000 tripeptides. The SwissProt database of proteins contains 144,041,553 amino acids in 399,749 proteins as of 14 October. Does this tripeptide occur 18,005 times in the database as it should? If it doesn't, is negative selection preventing it? If it does occur this often, have we missed other chromophores? Are there other tripeptides missing from SwissProt? If there are, does this tell us how to build other chromophores? Or does it tell us something important about protein structure?
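Here is the expected count worked out, as a rough sketch. It assumes all 20 amino acids are equally frequent and that positions are independent, neither of which is true of real proteins. Dividing the residue count by 8000 gives roughly 18,005; counting only the three-residue windows that actually fit inside each protein (two fewer than its length) gives a slightly smaller figure.

```python
# Expected number of Ser-Tyr-Gly tripeptides under a uniform random model.
RESIDUES = 144_041_553   # amino acids in SwissProt, as quoted in the post
PROTEINS = 399_749

p_tripeptide = (1 / 20) ** 3            # one chance in 8000

naive_expected = RESIDUES * p_tripeptide

# A protein of length n contains n - 2 overlapping tripeptide windows.
windows = RESIDUES - 2 * PROTEINS
expected = windows * p_tripeptide

print(f"one in {round(1 / p_tripeptide)}")
print(f"residues / 8000: {naive_expected:.0f}")
print(f"tripeptide windows: {windows}")
print(f"expected occurrences: {expected:.0f}")
```

Actually counting the occurrences would need the database itself, which this sketch does not touch.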

I don't have the skills to properly interrogate SwissProt or the Protein Data Bank, but I imagine that some of the readership does. Go to it. These are curious incidents indeed.

September 30, 2008

Chemiotics: Auditing P-Chem

Posted on behalf of Retread

Why would an ex-organic chemist, retired MD do that? The P-chem you need for organic chemistry is pretty simple. You can look at most reactions and figure the overall entropy and enthalpy, and we get pretty good at figuring out delta-deltaG and manipulating it to get reactions to go the way we want.

Well, the answer is that nearly all the really interesting questions in cellular biology involve physical chemistry. Look back at the post of 20 March where throwing a growth factor at a cell resulted in a twofold change in phosphorylation in 924 of 6,600 phosphorylatable sites in 2,244 different proteins. We have some 478 enzymes (called kinases) to accomplish this reaction. Why so many? Because most kinases have a limited number of substrates. Studying the phosphorylation reaction itself (i.e., the classic chemistry) tells you very little. What determines which kinase associates with which substrate? That's exactly where physical chemistry comes in. The association of one protein with another doesn't involve covalent (or even ionic) bonds. It's mostly van der Waals and hydrogen bonding, along with solvent effects. Pure P-Chem.

Non(classical chemical) bonding protein association is crucial in the normal life of the cell (and sometimes in its death). Consider the mediator complex. It is required for the molecular machines which transcribe DNA into RNA (the three RNA polymerases) to actually do their work. Depending on the organism, the mediator complex has between 20 and 30 proteins and a mass of 1-2 megaDaltons. Also, RNA polymerase II itself isn't just one protein, but 12 (in yeast) with a mass of 500 kiloDaltons — again held together by noncovalent interactions.

A personal reason for studying P-Chem is the protein folding problem, where nary a covalent bond is formed. I'd certainly like to get up to speed to read the literature and find out if the 'potential energy funnel' is more than a fancy way to say that (biologic) proteins fold into their final shape quickly. As docs, we do this all the time. Consider the diagnosis of idiopathic thrombocytopenic purpura. Impressive, n'est-ce pas? However, all it means is that you are bleeding because you don't have enough platelets (a type of blood component) and we don't know why.

We've already been through the 3 laws of thermodynamics, the second introduced by Carnot's brilliant analysis of the changes in state of an ideal gas as it went around his cycle, and his discovery (better construction) of the concept of entropy. Even after nearly 200 years, the power of his thought is impressive. I doubt that most of you have the time, but you will be similarly impressed with the stunning power of Darwin's mind if you read "The Origin of Species". All of you have more background (just by inhaling the zeitgeist) than he did. If you really have a lot of time, read "Darwin's Ghost" by Steve Jones along with Darwin. Jones updates "The Origin .. " to 2000 chapter by chapter. Although Jones is an excellent writer, Darwin wins each chapter hands down.

Finally, the course is being given at the local state university. It's very gratifying to see that state universities continue to function as the giant engines of social mobility that they were for my parents' generation, educating immigrants and the children of immigrants. The present crop of students isn't predominantly from eastern and southern Europe as my father's class was at Rutgers 80 years ago. But immigrants they are, and 3 of the students I've spoken to were born in Nigeria, Haiti and Poland.

September 03, 2008

Chemiotics: Apologies to Borodin

Posted on behalf of Retread

Can you picture yourself spending a week with a group of people who can't tell an Angstrom from arugula, some of whom are wary of all "chemicals"? Many highly analytic types (mathematicians, computer scientists, physicists, electrical engineers and even chemists) do just that and enjoy it immensely. I speak of adult amateur chamber music festivals (or 'band camp for adults' as one of my friend's grandkids calls them). After 35 years of them, I only met the 5th chemist this year. They are vastly outnumbered by the other analytics, particularly mathematicians and physicists.

Participants are highly educated for the most part, but the most talented cellist this year was a moving-company man who hauls furniture around for a living, and I still remember playing with a marvellous 300-pound violist years ago who was a jail matron.

If you were an aspiring organic chemist in the early 60s, the bible was "Mechanism and Structure in Organic Chemistry" by Edwin S. Gould, a physical chemist amazingly enough. He also happens to be an excellent violinist and I had the pleasure of playing with him a few years ago. He's still active in research although he received his PhD from UCLA in 1950. Who says chemicals are toxic!?

Occasionally the two cultures do clash, and a polymer chemist friend is driven to distraction by a gentle soul who is quite certain that "chemicals" are a very bad thing. For the most part, everyone gets along. Despite the very different mindsets, all of us became very interested in music early on, long before any academic or life choices were made.

So, are the analytic types soulless automatons producing mechanically perfect music which is emotionally dead? Are the touchy-feely types sloppy technically and histrionic musically? A double-blind study would be possible, but I think both groups play pretty much the same (less well than we'd all like, but with the same spirit and love of music).

I wonder why chemists are so outnumbered in this group? It's been downhill ever since Alexander Borodin. Perhaps a larger sample is needed. Any thoughts?

July 16, 2008

Chemiotics: Unrequired reading

Posted on behalf of Retread

If you look back at your notes on thermodynamics, you are likely to find a blizzard of partial derivatives, state functions and total differentials. As an organic chemist, I had an intuitive understanding of the thermodynamics I needed at the molecular level (actually it's pretty simple), but the math and the big ideas were not friends. Should you be in the same boat and wish to get the big picture, have a look at "Four Laws that Drive the Universe" by Peter Atkins. It's 124 small pages, written extremely well and bounces back and forth between the macroscopic and the microscopic illuminating each by the other. If there is a derivative to be found, I missed it.

The book may produce in you physics envy (with apologies to Freud). On p. 45 you will find a discussion of Noether's theorem, which states that under all the conservation laws of physics lies a symmetry. The first law (conservation of energy) is really about the symmetry of time flow — e.g., "time flows steadily, it does not bunch up and run faster then spread out and run slowly." Chemistry just doesn't have statements of such majesty (or strangeness).

If you liked Atkins you'll love "Boltzmann's Atom" by David Lindley. It concerns Boltzmann's trials and tribulations as he developed statistical mechanics. As a neurologist I doubt that they drove him to suicide at 62 (he sounds pretty loosely wrapped throughout his life). Boltzmann's big opponent was Ernst Mach, who didn't see the need for atoms as an explanatory device. Mach's view was that physics should establish laws tying observable phenomena together (e.g., the ideal gas law). Postulating something you couldn't see to explain something you could was not considered science (by Mach and his followers). Pretty strange to our way of thinking today, but these were the events of just over 100 years ago.

However, vestiges of Mach’s thinking linger on in the Copenhagen interpretation of quantum mechanics. As junior chemistry majors in the 50s we had to read "The Logic of Modern Physics" written by P. W. Bridgman in 1927. It was our introduction to quantum mechanics (as none of us had the math to tackle it). All you could hope to predict by a theory were 'numbers on a dial'. Going deeper by hoping for a trajectory to explain things was a no-no (the nodes in atomic and molecular orbitals pretty much rule out trajectories, don't they?). The book drove us nuts at the time, as chemistry back then was firmly on the macroscopic side of the quantum mechanical divide.

Gibbs and Maxwell make their appearance in Lindley's book, as does the culture and politics of Austria-Hungary before WWI, so there is some breathing room for the reader. One of the founders of physical chemistry, Wilhelm Ostwald, also appears. He doesn't come off too well — he was enamored of something called energetics, which to Boltzmann (and to Lindley who is a physicist) meant that he really didn't understand physics very well.

Atoms were finally accepted after Einstein's work on Brownian motion in 1905 (also described). Parenthetically, there was a similar controversy ending about the same time, as to whether the brain was made of cells, and whether individual neurons existed, or whether the whole brain was a big gemish of nuclei and fibers.

June 17, 2008

Chemiotics: A chemical double entendre

Posted on behalf of Retread

A few chemists who were both literary and literal recently looked fairly silly on the pages of the New York Times and in the blogosphere. You can read all about it on Michelle Francl's blog "Culture of Chemistry". See her post of 1 May '08, "How to tell if you're really a chemist." To make a long story short, 3 chemically impossible organic molecules (5 bonds to carbon etc., etc...) spelled out the word SEX in a review of a book about (what else?) sex. The chemists missed the semantic forest while closely inspecting the chemical trees.

Is there anything inside the cell being read chemically two different ways? Yes there is, and it has implications for how we determine what in the genome is being worked on by natural selection and what is being left alone. If intron, exon, neutral selection and synonymous and nonsynonymous codons aren't old friends, have a look at the first comment which will give you all the background you need (which is quite a bit).

People attempt to measure the rate of natural selection acting on proteins using synonymous and nonsynonymous codons in the same protein in different organisms (say hemoglobin for example). Positive selection is measured as the rate of nonsynonymous nucleotide substitution (Ka) per nonsynonymous site, relative to the underlying 'neutral mutation rate', which is given by the rate of synonymous substitution per synonymous site (Ks). Usually Ka is much less than Ks (as most new mutations aren't helpful or are actually harmful — this is negative selection). Positive selection is implied by Ka/Ks greater than 1. However, strictly by chance, the ratio of nonsynonymous (Ka) to synonymous (Ks) amino acid substitutions is 2:1.
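The ratio itself is simple to compute once the four counts are in hand. The sketch below uses made-up counts purely for illustration, and it leaves out the corrections for multiple substitutions at the same site that real estimation methods apply; the function names are my own.

```python
def ka_ks(nonsyn_subs, nonsyn_sites, syn_subs, syn_sites):
    """Ka/Ks from raw counts, with no correction for multiple hits."""
    ka = nonsyn_subs / nonsyn_sites   # nonsynonymous substitutions per nonsynonymous site
    ks = syn_subs / syn_sites         # synonymous substitutions per synonymous site
    return ka / ks


def interpret(ratio):
    """Conventional reading of the ratio, as described in the post."""
    if ratio > 1:
        return "positive selection"
    if ratio < 1:
        return "negative selection"
    return "neutral"


# Hypothetical gene A: few amino acid changes relative to silent ones.
ratio_a = ka_ks(nonsyn_subs=6, nonsyn_sites=300, syn_subs=12, syn_sites=100)

# Hypothetical gene B: amino acid changes outpace silent ones.
ratio_b = ka_ks(nonsyn_subs=30, nonsyn_sites=300, syn_subs=5, syn_sites=100)

print(f"gene A: Ka/Ks = {ratio_a:.2f} ({interpret(ratio_a)})")
print(f"gene B: Ka/Ks = {ratio_b:.2f} ({interpret(ratio_b)})")
```

Everything downstream depends on Ks being a fair stand-in for the neutral rate, which is exactly the assumption questioned next.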

All very nice, but exonic splicing silencers (ESSs) and exonic splicing enhancers (ESEs) are found in exons, and mutations of them will change alternative splicing (something a functioning cell has a great interest in). It's easy to see how changing one nucleotide in an ESS or an ESE could render it more or less effective, while leaving the amino acid sequence of the underlying protein unchanged. In short, the 'neutral mutation rate' may in fact not be neutral at all (if it is in an ESE or an ESS). Or possibly switching one amino acid for another has nothing whatever to do with the protein and everything to do with controlling alternative splicing.

Now, chemists are adept at doing all sorts of different things with the same structure. Think what organic chemists can do with a carbonyl group. But whatever they do is over and done with. In protein-coding genes, the same sequence can mean two different things without being chemically changed at all.

We are far from understanding all the things DNA can do in a cell. Less than 2% of our 3.2 gigabases of DNA codes for exons. Calling the 98% of the genome not doing so 'junk' is a vestige of the protein-centric era of molecular biology, as is calling the switch from one synonymous codon to another 'neutral'. Both assume that the only thing that DNA does is code for protein.

The expressive power of language lies in its ambiguity not its precision. DNA may be similar as we uncover the languages it speaks. My guess is that there are more to be found.

June 02, 2008

Chemiotics: A chemical gedanken experiment

Posted on behalf of Retread

In the early days of quantum mechanics Einstein and Bohr threw thought experiments (gedanken experiments) at each other like teenagers throwing firecrackers. None were thought possible at the time, although thanks to Bell and Aspect, quantum nonlocality and entanglement now have a solid experimental basis.

Two Chemiotics posts ago there appeared the following: “I doubt that most strings of amino acids have a dominant shape (e.g., biological meaning), and even if they did, they couldn't find it quickly enough (the Levinthal paradox again).”

How would you prove me wrong? The same way you'd prove a pair of dice was loaded. Just make (using solid-phase protein synthesis) a bunch of random strings of amino acids (say 41 amino acids long) and see how many have a dominant shape. If one crystallizes it does, if not, use NMR to look at them in solution. You can't make all of them, because the earth doesn't have enough mass to do so (see “How many proteins can we make?” a few posts back). That's why this is a gedanken experiment — it can't possibly be performed in toto.

Even so, the experiment is over (and I’m wrong) if even 1% of the proteins you make have a dominant shape.

However, choosing a random string of amino acids is far from trivial. Some amino acids appear more frequently than others depending on the protein. Proteins are definitely not a random collection of amino acids. Consider collagen. In its various forms (there are over 20, coded for by at least 30 distinct genes) collagen accounts for 25% of body protein. Statistically, each of the 20 amino acids should account for 5% of the protein, yet one amino acid (glycine) accounts for 30% and proline another 15%. Even knowing this, the statistical chance of producing 300 copies in a row of glycine–any amino acid–any amino acid by random distribution of the glycines is less than zilch. But one type of bovine collagen protein has >300 such copies in its 1042 amino acids.
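
To put a number on "less than zilch", here is a minimal sketch. It assumes (my simplification, not the post's) that each Gly-X-Y triplet independently begins with glycine with probability 0.30, the glycine share quoted above. Under that assumption the chance of 300 such triplets in a row comes out near 10^-157.

```python
import math

GLYCINE_FRACTION = 0.30   # glycine's share of collagen, from the text
REPEATS = 300             # consecutive Gly-X-Y triplets, from the text

# Assumption: every triplet starts with glycine independently with p = 0.30;
# the X and Y positions may hold any amino acid, so they contribute a factor of 1.
log10_probability = REPEATS * math.log10(GLYCINE_FRACTION)

print(f"P(300 Gly-X-Y repeats in a row) is about 10^{log10_probability:.0f}")
```

Working in logarithms avoids printing a float with 150-odd leading zeros and makes the order of magnitude easy to read.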

One further example. If you were picking out a series of letters randomly hoping to form a word, you would not expect a series of 10 ‘a’s to show up. But we normally contain many such proteins, and for some reason too many copies of the repeated amino acid produce some of the neurological diseases I (ineffectually) battled as a physician. Normal people have 11 to 34 glutamines in a row in a huge (molecular mass 384 kiloDaltons — that’s over 3000 amino acids) protein known as huntingtin. In those unfortunate individuals with Huntington’s chorea, the number of repeats expands to over 40. One of Max Perutz’s last papers [Proc. Natl. Acad. Sci. USA 99, 5591–5595 (2002)] tried to figure out why this was so harmful.

On to the actual experiment. Suppose you had made 1,000,000 distinct random sequence proteins containing 41 amino acids and none of them had a dominant shape. This proves/disproves nothing. 10^6 is fewer than the possibilities inherent in a string of 5 amino acids, and you've only explored 10^6/(20^41) of the possibilities.
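
The two comparisons in that paragraph are easy to check. The sketch below uses only the numbers in the text (20 amino acids, chains of 41, a million samples); the variable names are mine.

```python
from fractions import Fraction

ALPHABET = 20        # amino acids
LENGTH = 41          # residues per chain, from the text
SAMPLED = 10**6      # distinct random proteins actually made

pentapeptides = ALPHABET**5                      # strings of 5 amino acids
explored = Fraction(SAMPLED, ALPHABET**LENGTH)   # share of sequence space sampled

print(pentapeptides)      # number of possible 5-residue strings
print(float(explored))    # fraction of the 41-residue space explored
```

A million samples falls short of the 3,200,000 possible pentapeptides, and the explored fraction of the 41-residue space is of order 10^-48.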

Would Karl Popper, philosopher of science, even allow the question of how commonly proteins have a dominant shape to be called scientific? Much of what I know about Popper comes from a fascinating book “Wittgenstein's Poker” and it isn't pleasant. Questions not resolvable by experiment fall outside Popper's canon of questions scientific. The gedanken experiment described can resolve the question one way, but not the other. In this respect it’s like the halting problem in computer science (there is no general rule to tell if a program will terminate).

Would Ludwig Wittgenstein, uberphilosopher, think the question philosophical? Probably not. His major work “Tractatus Logico-Philosophicus” concludes with “What we cannot speak of we must pass over in silence”. While he’s the uberphilosopher he’s also the antiscientist. It’s exactly what we don't know which leads to the juiciest speculation and most creative experiments in any field of science. That's what I loved about organic chemistry years ago (and now). It is nearly always possible to design a molecule from scratch to test an idea. There was no reason to make [7]paracyclophane, other than to get up close and personal with the ring current.

If the probability or improbability of our existence, to which the gedanken experiment speaks, isn’t a philosophical question, what is?

May 15, 2008

Chemiotics: Do you know where your drug is (and what it is doing)?

Posted on behalf of Retread

Reading the biomedical literature is like reading a large Russian novel with thousands of characters who interact in unexpected ways. A recent paper [Nature Medicine vol. 14 pp. 382–391 (2008)] brings together 3 such actors: CFTR, the protein mutated in cystic fibrosis; ceramide, a molecule only of interest to neurologists until recently; and amitriptyline, a drug for depression whose mechanism of action was (seemingly) known.

Let’s start with CFTR, a huge protein (1480 amino acids). CFTR mutations cause cystic fibrosis, the commonest hereditary disease of Caucasians. There must be some selective advantage to CFTR mutations as over 600 were known as of 2003. However just one accounts for >50% of all cases. It is a deletion of phenylalanine at position #508 (showing just how delicate protein structure and function really are). One guess is that the mutants protect against intestinal pathogens (infantile diarrhea kills many children in the developing world).

Ceramide and its derivatives contain two saturated unbranched hydrocarbon chains (16–20 carbons long). They are found in myelin (the wrapping of nerve fibers) which is mostly lipid. All sorts of awful hereditary neurological diseases (usually affecting children, but fortunately rare) are caused by the accumulation of molecules containing ceramide. In recent years, ceramide's effects on non-neuronal cell proliferation and/or cell death have become prominent. Ceramide is a second messenger. The intracellular effects of ceramide in the normal workings of the brain haven't been much studied.

Amitriptyline (Elavil) was one of the earliest antidepressants. We all knew how it worked: by blocking the re-uptake of neurotransmitters such as serotonin and norepinephrine from the synapse (except that this is an acute effect, and this class of drugs, the tricyclic antidepressants, takes a few weeks to work).

Surely you see how all this fits together at this point. No? I didn't either. Read on...

It turns out that CFTR mutations increase the levels of ceramide inside the lungs (the primary site of infection in cystic fibrosis). This is caused by alkalinization of the intracellular sites where ceramide is broken down. Elevated ceramide levels are thought to increase cell death, resulting in lung infection (the bacteria have more to munch on).

Where does amitriptyline fit in? It lowers lung ceramide levels. How? By decreasing the amount and/or the activity of an enzyme (acid sphingomyelinase) which breaks down a precursor of ceramide. The paper is silent on the mechanism(s) by which this happens (but does give two references #24, #25). Treating transgenic mice with mutant CFTR with amitriptyline decreases the frequency and severity of their lung infections. Amazing.

Where does the effect of amitriptyline on neurotransmitter re-uptake fit into all of this? It doesn't, and that's just the point.

Nowadays, medicinal chemists design organic molecules to fit into slots of proteins whose function they are trying to alter. The tricyclic antidepressants weren't discovered this way (they are much older), but papers like Mol. Pharmacol. vol. 50 pp. 957–965 (1996) found crucial amino acids in the re-uptake protein to which they bound. A fairly open and shut case for their mechanism of action.

Except it isn't. Who knows how many designer drugs are really working the way we think they do. A cautionary tale indeed...

May 07, 2008

Chemiotics: Why should a (biological) protein have one shape?

Posted on behalf of Retread

Back in the 80s when artificial intelligence (AI) was going to make humans obsolete, LISP was the programming language of choice for AI. As a neurologist I was interested in intelligence in any form (machine or otherwise) so I tried to learn it. Most programs looked like gibberish. There was a great quote in a book "Let's Talk LISP" after a particularly convoluted piece of code: "Relax, you never understand anything, you just get used to it".

I think the same thing has happened with our understanding of biologically relevant proteins. We've just become used to the fact that biological proteins have a dominant shape. However, we also know that other polymers don't. DNA and RNA certainly don't have a single shape.

So why do biologically meaningful proteins have one? Consider enzymes. The amino acid side chains comprising the active site are found all over the protein rather than next to each other in the sequence. Chymotrypsin, one of the best studied enzymes, has a catalytic triad made from histidine #57, aspartic acid #102 and serine #195. To function, they must be brought near to each other and held there fixed (and in the proper orientation to boot). The same holds for structural proteins that make up muscle and the cytoskeleton.

Yet only 10 kcal/mole — 2 hydrogen bonds — is enough to denature them. Not much of an activation energy — not even close to a covalent bond. Once denatured, Anfinsen showed that ribonuclease found its way back to the original shape, implying that there were no other conformations of similarly low energy available to it.

It is remarkable that we only have 20,000 or so protein coding genes when you consider just how large possible protein space is. In this regard, proteins are like English words. There are very few of them when you calculate how many there could be. Sonnet #18 — "Shall I compare thee to a summer's day?" contains 114 words of which 17 are 7 or more letters long. The Oxford English dictionary contains 600,000 or so words of all lengths. There are 8 x 10^9 strings of 7 letters. Few of them have meaning.

Words are a lot shorter than proteins. There are 8 times as many strings of 4 amino acids (20^4 = 160,000) as we have proteins. My guess is that this isn't an accident, because I doubt that most strings of amino acids have a dominant shape (e.g., biological meaning), and even if they did, they couldn't find it quickly enough (the Levinthal paradox again).
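
The counting behind the word and protein comparison can be reproduced in a few lines. This is only a sketch of the arithmetic; it assumes a 26-letter alphabet and takes the round figure of 20,000 protein-coding genes from the text.

```python
LETTERS = 26                    # assumption: plain 26-letter English alphabet
AMINO_ACIDS = 20
PROTEIN_CODING_GENES = 20_000   # round figure quoted in the text

seven_letter_strings = LETTERS**7        # all possible 7-letter strings
tetrapeptides = AMINO_ACIDS**4           # all possible strings of 4 amino acids
ratio = tetrapeptides / PROTEIN_CODING_GENES

print(seven_letter_strings)   # about 8 x 10^9
print(tetrapeptides, ratio)   # 160,000 strings, 8 per protein-coding gene
```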

How would you prove me wrong? Is the question even meaningful scientifically? I (of course) think it is quite meaningful in a philosophic sense, since it bears on just how probable or improbable life is. The next post will discuss some gedanken experiments which could settle the question (or show that it is unanswerable).

April 23, 2008

Chemiotics: Why should a protein have one shape?

Posted on behalf of Retread

Well of course they don't, but the proteins we know the most about (because they can be crystallized and their structure determined by X-ray diffraction) do have a shape. Sperm whale myoglobin, the first protein to have its 3-dimensional structure determined, showed that this couldn't be the whole story. Sperm whales (air breathing mammals after all) use their myoglobin to carry oxygen during their hour-long dives down to 1000 meters. Kendrew and Perutz's crystal structure showed no way for oxygen to find its way in to the embedded porphyrin ring. Amazingly, the 153 amino acids of myoglobin must themselves breathe to let the oxygen in.

All it takes to denature (seriously change its tertiary structure so it is no longer functional) a protein of 100 amino acids is 10 kcal/mole (Voet & Voet - Biochemistry 3rd Edition p. 258). That's two hydrogen bonds - not much.

Sight your eye at the alpha carbon of one of the amino acids of this protein, looking toward the carbonyl carbon. There are three conformational energy minima the carbonyl can adopt. That's potentially 3^99 = 10^48 conformations (clearly an overestimate because of self intersection, but still, a huge number). Yet to be crystallizable, this protein must choose just one of them, and it must be lower in energy by 2 hydrogen bonds than all the rest.

In addition, to get to this single structure, the protein can't possibly sample all the conformations available to it. The rotation barrier of ethane is 12 kJ/mole and a barrier of 73 kJ/mole allows a rotation rate of 1 per second, and every 6 kJ changes the rate by a factor of 10 at 25 deg C (Clayden et al. Organic Chemistry pp. 450-1). So the maximum rate of rotation of ethane is 10^11 per second (at a body temperature of around 37 deg C) rather than 10^10 at 25 deg C. This is clearly an upper bound on the rotation rate as the mass attached to the alpha carbons of a protein will make the rotation far slower, but let it pass (that's why I chose ethane in the first place). That's 10^37 seconds to sample the conformations available, far longer than the age of the universe. This is the Levinthal paradox.
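
The estimate can be traced step by step. The sketch below takes the barrier figures quoted from Clayden and the post's rounded 10^11 rotations per second; it assumes one new conformation is sampled per rotation and uses roughly 4.4 x 10^17 seconds (about 13.8 billion years) for the age of the universe, a figure that is my assumption and not in the post. Computed exactly, 3^99 is a little under 10^48, so the sampling time lands within an order of magnitude of the rounded value in the text; the conclusion is unchanged.

```python
import math

ETHANE_BARRIER_KJ = 12.0           # rotation barrier of ethane, from the text
ONE_PER_SECOND_BARRIER_KJ = 73.0   # barrier giving 1 rotation per second at 25 deg C
KJ_PER_FACTOR_OF_TEN = 6.0         # every 6 kJ changes the rate tenfold at 25 deg C

# Rate at 25 deg C implied by the rule of thumb: about 10^10 per second.
rate_25c = 10 ** ((ONE_PER_SECOND_BARRIER_KJ - ETHANE_BARRIER_KJ) / KJ_PER_FACTOR_OF_TEN)

RATE_BODY_TEMP = 1e11              # the post's rounded rate at about 37 deg C
AGE_OF_UNIVERSE_S = 4.4e17         # assumption: roughly 13.8 billion years

conformations = 3 ** 99            # three minima at each of 99 bonds
# Assumption: one conformation is sampled per rotation.
seconds_needed = conformations / RATE_BODY_TEMP

print(f"rate at 25 deg C: about 10^{math.log10(rate_25c):.1f} per second")
print(f"conformations: about 10^{math.log10(conformations):.1f}")
print(f"sampling time: about 10^{math.log10(seconds_needed):.1f} seconds")
print(f"ages of the universe needed: about 10^{math.log10(seconds_needed / AGE_OF_UNIVERSE_S):.1f}")
```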

So for the crystallizable proteins (all of biological interest so far) one conformation out of all those available must be more stable (but only by two hydrogen bonds) than all the rest, and the particular conformation must be findable quickly (or we'd all be dead).

How likely is this for a 'random' sequence of amino acids? We'll probably never know (but we might if we're lucky). This is the subject of the next post...

April 15, 2008

Chemiotics: How many proteins can we make?

Posted on behalf of Retread

The mass of the earth is given by my physics book (Halliday 6th Ed.) as 6 x 10^27 grams. If we made just one molecule of each protein containing n amino acids linked together, when would we run out of material? Make a guess. I found the results surprising.

Assume the earth is made of nothing but hydrogen, oxygen, nitrogen, carbon and sulfur. Clearly not true, but we're going for what mathematicians call an upper bound. If mathematicians can get away with things like "consider a spherical cow" I can get away with this. (The cognoscenti may wish to go for a least upper bound). Proteins are linear chains of 20 different amino acids ranging in mass from glycine at 75 Daltons to tryptophan at 204. When linked together by an amide (peptide) bond, 18 Daltons of mass is lost (water is split out). So figure the average amino acid at 100 Daltons (roughly).

So there are 20 x 20 = 400 distinct proteins of 2 amino acids, 8000 with 3, 160,000 with 4, 3,200,000 with just 5. Shorties like this are called peptides (or polypeptides) and just when you start calling them proteins seems to be a matter of taste.

We're figuring the mass of the typical amino acid at 100 Daltons, but a Dalton doesn't have much mass. It is 1/12 the mass of a single atom of carbon-12, Avogadro's number (about 6 x 10^23) of which have a mass of 12 grams. So one Dalton has a mass of 10^-24 grams (roughly).

The number of distinct proteins containing n amino acids is 20^n. The mass of each protein (in Daltons) is (roughly) 100 x n, depending on the amino acids chosen. The mass of the collection of distinct proteins of length n in grams is (20^n) x (100 x n) x (10^-24). It's clear that we're over 1 gram for the collection at only 24 amino acids (as 20^24 is much larger than 10^24). How far over? Since 20^24 = 2^24 x 10^24, the powers of ten cancel, leaving 2^24 x 100 x 24 = 40,265,318,400 = 4 x 10^10 grams.

As noted, the mass of the earth is 6 x 10^27 grams. So we're not too far away at 24 amino acids. Certainly no farther away than another 17 amino acids as 20^17 is much greater than 10^17.

So, the mass of the earth (which isn't all carbon, hydrogen, etc... ) isn't enough to make just one molecule of each of the possible proteins 41 amino acids long. 41 amino acids is a very small protein (some would call it a polypeptide). Just about every protein of biological interest is much larger. The champ is a muscle protein called titin which has 27,000+ amino acids.
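
For anyone who wants to check the bound, here is a minimal sketch of the calculation under the post's own roundings (100 Daltons per residue, 1 Dalton taken as 10^-24 grams, earth at 6 x 10^27 grams). The function names are mine. Integer arithmetic in Daltons keeps everything exact.

```python
EARTH_MASS_G = 6 * 10**27    # from the text
DALTONS_PER_GRAM = 10**24    # the post's rounding: 1 Dalton is about 10^-24 g
DALTONS_PER_RESIDUE = 100    # the post's rough average amino acid mass


def collection_mass_daltons(n: int) -> int:
    """Mass of one molecule each of all 20^n proteins of length n, in Daltons."""
    return 20**n * DALTONS_PER_RESIDUE * n


def first_length_exceeding_earth() -> int:
    """Smallest n whose full collection outweighs the earth, under these roundings."""
    earth_mass_daltons = EARTH_MASS_G * DALTONS_PER_GRAM
    n = 1
    while collection_mass_daltons(n) <= earth_mass_daltons:
        n += 1
    return n


mass_24_g = collection_mass_daltons(24) // DALTONS_PER_GRAM
print(mass_24_g)                        # grams for all proteins of length 24
print(first_length_exceeding_earth())   # chain length at which the earth runs out
```

With these roundings the collection at 24 residues weighs 40,265,318,400 grams, and the crossover arrives at 38 residues, comfortably inside the 41 used as a safe bound above.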

So what? It means that chemists will never be able to explore more than a tiny morsel of the space of possible proteins. Perhaps computationally we will (I doubt it), but that's the subject of a future post.

April 03, 2008

Chemiotics: Causality in the cell and how puppies give us hope

Posted on behalf of Retread

This post is pretty philosophic, but it discusses some issues raised by the previous post that shouldn't be ignored. Future posts will be far more chemical.

Do we have any hope of constructing a nice chain of causality for what happens when we throw epidermal growth factor (EGF) at a HeLa cell (described in the last post)? That is, the EGF receptor activates kinases 1 through N, each of which phosphorylates substrates (some of which are other kinases) which eventually phosphorylate the 924 sites on the 2,244 proteins (and in the correct temporal order to boot). I don't think there are enough researchers to do it, or labs to hold them. Even worse, if the results were available, I don't think our minds are strong enough to grasp them.

A big stumbling block would be the multiple degrees of feedback present even with something as simple as phosphorylation and dephosphorylation. Simple ideas of causality and control vanish with feedback (see two posts back — "The Decline of Master Gland..."). Causality is inherently a linear, sequential idea. Even chaos theory is basically causal, although predictability goes out the window.

That's not to say our brains don't do incredibly complex things such as just recognizing people. You never see anyone at exactly the same angle, under the same light, with the same background. People are usually moving, attired differently, etc., etc... Yet our brains in some way compute an invariant that computer science can only dream about, permitting instant recognition. As people move about and you interact with them, zillions of new sensory inputs must be absorbed, transformed and matched to the same invariant. Since we do all this unconsciously, we don't think anything of it.

Yet we don't do very well predicting events where feedback is involved (like the stock market where most people lose). Perhaps the next step up in human intelligence is the ability to perceive the various forms of multiloop feedback, the way we recognize faces and people.

Could our brains change that fast? Possibly. Consider man's best friend vs. the chimp [Science vol. 298 pp. 1634–1636 (2002)]. Chimpanzees are terrible at picking up human cues as to where food is hidden, even when the cues are as obvious as pointing to the food container. Even chimps that eventually perform well take dozens of trials or more to learn what the cues mean. I find this surprising.

However, puppies (raised with no contact with humans) do much better at this than chimps — anyone owning a dog knows they can read us like a book. Wolf cubs don't do better than the chimps, even cubs raised by humans. This implies that during the process of domestication, dogs have been selected for a set of cognitive and social abilities that allow them to communicate with us. Domestication has only gone on for 10,000–15,000 years (the dawn of agriculture). I find it absolutely incredible that we could have changed the dog's brain in what is basically a microsecond in evolutionary time. Yet we did. Hopefully our brains are as plastic.

Not to be too depressed by this. There clearly are single chains of causality in the cell and chokepoints which we can find and modify. Consider Gleevec. Success stories like this provide employment for legions of chemists.

March 20, 2008

Chemiotics: The vanishing simplicity of chemical pathways in the cell

Posted on behalf of Retread

So nat'ralists observe, a flea
Hath smaller fleas that on him prey,
And these have smaller fleas that bite 'em,
And so proceed ad infinitum.

– Jonathan Swift

Is anything like this going on in the cell? Consider mitogen activated protein kinase kinase kinase (abbreviated MAPKKK) — shades of Major Major Major in Catch-22. Recall that a kinase is an enzyme which attaches a phosphate group to (phosphorylates) one of the 3 amino acids with hydroxyls on their side chains — serine, threonine and tyrosine. A phosphate ester is formed in the process adding a significant amount of negative charge and some local bulk to the protein (and if the protein is an enzyme often significantly altering its activity).

And what is the target that MAPKKK phosphorylates? Why MAPKK, another kinase which itself phosphorylates MAPK (yet another kinase — I'm not making this up). MAPK phosphorylates a variety of proteins, among them transcription factors which turn on various genes.

All quite linear (sequential) and comprehensible. There is a nice chain of causality from the agent outside the cell (the mitogen) to the receptor for it, to MAPKKK and so on to a particular set of genes whose level of expression is altered with the net result being cellular proliferation (e.g., mitosis).

Discovering this pathway took a lot of hard work on the ras protein, which is mutated in 30% of all cancers. Just the steps from the mitogen binding to its receptor to ras and thence to MAPKKK are quite complex. It was a hard slog, one (linear) step at a time. But what if all this work was like the drunk looking under the street light for his key because that's where the light was? Suppose far more than that is going on.

Instead of teasing out pathways one protein at a time, suppose you just threw a mitogen (in this case epidermal growth factor, EGF) at a cell (OK, a cancer cell: the HeLa cell, the workhorse of cancer research) and looked at every protein to see what was phosphorylated and what was not. Using advanced mass spectrometry and some other cutting edge techniques, the authors of a 2006 paper [Cell vol. 127 pp. 635–648, 2006] did just that. Some 6,600 distinct phosphorylation sites on 2,244 different proteins were found. Of the 6,600 sites, 924 (about 14%) showed more than a twofold change in the phosphorylated to unphosphorylated ratio.

In addition, the work was repeated at several time points within 30 minutes of EGF application, allowing the time course of phosphorylation at each site to be determined. The time courses of phosphorylation varied from site to site. Many proteins had more than one phosphorylation site. Even on the same protein the time course of phosphorylation depended on the site studied. At least 46 distinct regulators of gene transcription showed a greater than twofold variation in phosphorylation. It doesn't take much imagination to see that adding a lot of negative charge would alter the ability of a transcription factor to approach DNA (which has one phosphate per nucleotide).

Where this leaves our notion of causality (which really is quite linear) and whether our minds are strong enough to comprehend these events is the subject of the next post.

Retread

March 12, 2008

Chemiotics: The decline of the master gland and the rise of feedback

Posted on behalf of Retread

Endocrinology was pretty simple in med school back in the 60s. All the target endocrine glands (ovary, adrenal, thyroid, etc.) were controlled by the pituitary, a gland about the size of a marble sitting an inch or so directly behind the bridge of your nose. The pituitary released a variety of hormones into the blood (one or more for each target gland) telling the target glands to secrete, and secrete they did. The master gland ruled.

Things became a bit more complicated when it was found that a small (4 grams or so out of 1500) part of the brain called the hypothalamus sitting just above the pituitary was really in control, telling the pituitary what and when to secrete. Subsequently it was found that the hormones secreted by the target glands (ovary, etc.) were getting into the hypothalamus and altering its effects on the pituitary. Estrogen is one example. Any notion of simple control vanished into an ambiguous miasma of setpoints, influences and equilibria. Goodbye linearity and simple notions of causation.

As soon as feedback (or simultaneous influence) enters the picture it becomes like the three body problem in physics, where 3 objects influence each other's motion at the same time by the gravitational force. As John Gribbin (former science writer at Nature and now prolific author) said in his book ‘Deep Simplicity’, "It's important to appreciate, though, that the lack of solutions to the three-body problem is not caused by our human deficiencies as mathematicians; it is built into the laws of mathematics." The physics problem is actually much easier than endocrinology, because we know the exact strength and form of the gravitational force.

Organic chemists dearly love linearity. Nothing is more linear and causal than a multistep synthesis. We always search for conditions producing just what we want in high yield with as few unwanted products as possible, thank you. Le Chatelier's principle is used again and again to force reactions to go just the way we want. It is a type of thinking that will not help us understand what is going on within our cells.

At one time it was thought that we had about 100,000 genes coding for proteins. The best current estimates are around 20,000. These genes code for structural proteins (like those of muscle and bone) and enzymes which do things like metabolize sugar or build the components of structural proteins (amino acids) or of DNA and RNA (nucleotides). We are gradually finding out that a lot of our genes function as controlling elements.

For instance, we have 478 genes for enzymes called kinases which form phosphate esters on the hydroxyls of threonine, serine and tyrosine of proteins, radically altering their function usually (the phosphate group adds a lot of negative charge). We have 107 genes for enzymes (called phosphatases) just for removing the phosphate from tyrosine (never mind serine and threonine). Another 600 or so genes code for enzymes which add (or remove) a small protein called ubiquitin from other proteins. Again feedback, control and nonlinearity.

Where this leaves the notion of causality in the cell, and worse, our ability to comprehend it -- we do think linearly after all -- will be the subject of the next post.

Retread

March 05, 2008

Chemiotics: Is math harder than organic chemistry?

Posted on behalf of Retread

The Scandinavian Goddess I had a crush on all through high school could pick up any instrument and play it — piano, clarinet, guitar, saxophone, etc... She didn't think it was a big deal, it was just the way she was. The Hungarian uprising of '56 occurred while I was a freshman in college. A friend who already knew 12 or so languages picked up Hungarian in a week or two and went up to Camp Kilmer in New Jersey to act as a translator for the refugees. It was just something he could do. 50+ years later, the 16 year old high school student auditing an upper level college course in abstract algebra I was taking looked up occasionally from his German homework when the lecturer made an obscure point. He blitzed the course and later went on to college.

I don't think there is anything remotely like that in organic chemistry, although the rumor back then was that Woodward knew all of Beilstein before he hit puberty. Learning organic chemistry always seemed pretty easy and intuitive to me (even now when revisiting it years later). Perhaps it was playing with TinkerToys as a kid. I've found math much, much harder.

In organic chemistry you come to know carbon inside out and at least one atom of it is always present, so you can bring everything you already know (which is quite a bit) to the problem at hand. Math isn't like that at all. You are always bumping up against new definitions, concepts and theorems. Once you get past the plug and chug part of math (use the chain rule n times, integrate by parts m times to find an integral, look for a recursion formula by repeatedly differentiating) you are proving theorems. Here, you must bring everything you know about math to proving the theorem or problem at hand. You may have to create a function, a group, an ideal to solve it, reason by contradiction, think of a counterexample etc., etc...

Is there anything like that in organic chemistry? Of course there is. The theorems of organic chemistry are its syntheses. Every reaction you ever heard of comes into play, new ones must be invented, mechanistic pitfalls considered, conditions carefully adjusted etc., etc... You are not asked to synthesize strychnine as a college junior but you start proving theorems in math at that point and never stop. That's why math is harder (to learn).

So math is harder to learn, but organic chemistry and math are equally hard to do. If we really understood mechanism and reactivity, we could just write out the steps and have a robot perform them. We don't because our knowledge is very incomplete. In this sense, organic synthesis is actually harder than math, because in math you are starting with a huge background of solidly proven results which are at your disposal. In chemistry you have a similarly huge background, but there is no guarantee that any of it will work on your particular problem. It's your job to figure out why something which should have worked didn't do so and a way around it as well. That's not easy at all.

Retread

February 26, 2008

Chemiotics: The unbearable weirdness of quantum mechanics (with apologies to Kundera)

Posted on behalf of Retread

Much of the training of budding neurologists in the 60s was concerned with how to perform a good neurological exam and interpret the results. Various constellations of abnormalities pointed to different regions of the nervous system and the history often told us what sort of trouble was present there. Essentially we were inferring abnormalities of structure from abnormalities of function.

Why not just look? We had only two ways to do so back then: (1) sticking a needle in an artery and injecting a dye which X-rays couldn't pass through (radio-opaque dyes; if you don't already know what they are, think of what you'd want to synthesize), which had a 1–5% stroke rate at the time, and (2) injecting air via a spinal tap and taking X-rays subsequently (I'm not kidding).

The advent of computerized axial tomography (CAT scans) and MRIs (magnetic resonance imaging) changed all that. We were able to directly look at structure without a decent exam. Not only that – problems could be picked up before they produced changes in function (e.g., earlier).

Naturally, neurologists were panicked, thinking that we would soon become the buggy whip manufacturers of medicine. Somehow, telling my colleagues that MRIs showed the essential correctness of quantum mechanics didn't help, producing only blank stares and decreased referrals.

Telling the man in the street that spectroscopy alone shows the correctness of quantum mechanics (sharp absorption and emission lines show that only certain energies of molecules and transitions between them are permissible) just doesn't cut it. But everybody knows what an MRI is.

Forget the wave nature of light (for today). Think of photons as baseballs travelling at various speeds (I know, light has but one speed and its frequency determines its energy just as the speed of a baseball determines its kinetic energy). Throw the baseball at a window. If you throw it fast enough (high kinetic energy) it goes through, if you throw it slowly it doesn't. Everybody knows that.

Not so with the light used for MRI. It consists of radio waves carrying around one millionth of the energy of visible light, yet they go right through our skull and brain rather than bouncing back. Why? The only way we can get them absorbed by our brains is to place ourselves in a strong magnetic field in the scanner. The magnetic field essentially creates two new energy levels so close together in energy that the tiny energy difference between them matches the energy of the radio wave, permitting it to be absorbed. Without absorption, no pictures. Certainly counterintuitive, but used every day all over the world. Quantum mechanics rules (but weirdly).

Retread

February 19, 2008

Chemiotics: We had to destroy the village to save it

Posted on behalf of Retread

An incredible article appeared last month in the journal Science. If it can be verified and if it applies generally, our conception of just how genes coding for protein are turned on will be radically changed (yes, there are many kinds of genes besides those coding for proteins). If DNA compaction, nucleosomes, histones, lysine methylation and demethylation, the histone code, nuclear hormone receptors (particularly the estrogen receptor), DNA glycosylase and topoisomerase aren't old friends, have a look at the first comment on this post for the background you need. Don't worry, there is plenty of chemistry to follow.

Some histone code modifications are reversible, particularly acetylation of the epsilon amino group of lysine. Enzymes acetylating histone lysines are called histone acetylases; those removing the acetyl group are called histone deacetylases (HDACs). However, lysine methylation was thought to be permanent until '04, when several enzymes able to demethylate lysine were found. One such enzyme is called LSD1 (it has nothing to do with the hallucinogen). It removes the two methyl groups from lysine #9 of histone #3 (H3K9me2). If this modification is present on a nucleosome near a gene, the gene is silenced, so the methyls must be removed before the protein it codes for can be made.

The estrogen receptor + estrogen complex bound to the ERE (the estrogen response element – a 15 nucleotide DNA sequence) triggers H3K9me2 removal. The process of demethylation is oxidative (how else would you split a nitrogen to hydrocarbon bond?). Hydrogen peroxide is produced, a loose cannon which oxidizes the juicy electron-rich bases of DNA nearby, forming in particular 8-oxoguanine, as guanine is the most easily oxidized DNA base. Since 21% of the DNA bases in our genome are guanine, H2O2 doesn't have far to look. This calls in some fairly heavy artillery (DNA glycosylase to remove the 8-oxoguanine, topoisomerase IIbeta to unwind the DNA so it can be repaired, the repair enzymes, etc.). Naturally this opens up the compacted DNA structure around the gene, allowing RNA polymerase II to do its work transcribing the estrogen responsive gene into mRNA (once the damage is repaired).
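Just how short is "not far to look"? A back-of-the-envelope Python sketch follows, under the loud assumption that bases are independent draws with a 21% chance of being guanine (real sequence is not random, so treat this as a cartoon). The function names are made up for the illustration.

```python
# Cartoon model: every base is an independent draw, guanine with probability 0.21.
# Real genomes have local composition biases; this ignores them (assumption).

GUANINE_FRACTION = 0.21  # figure quoted in the post


def prob_guanine_within(n_bases, guanine_fraction=GUANINE_FRACTION):
    """Chance that at least one of n_bases consecutive bases is a guanine."""
    if n_bases < 0:
        raise ValueError("n_bases must be non-negative")
    return 1.0 - (1.0 - guanine_fraction) ** n_bases


def expected_bases_to_first_guanine(guanine_fraction=GUANINE_FRACTION):
    """Mean number of bases scanned until the first guanine (geometric mean 1/p)."""
    return 1.0 / guanine_fraction


if __name__ == "__main__":
    for n in (1, 5, 10, 20):
        print(n, "bases:", round(prob_guanine_within(n), 3))
    print("average bases to first guanine:", round(expected_bases_to_first_guanine(), 2))
```

Under that assumption a guanine turns up, on average, within about five bases along a strand, and a ten-base stretch contains one roughly nine times out of ten.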

So according to this paper, estrogen turns on gene transcription by damaging DNA. This is fantastic (if true). There's more. The estrogen receptor is but one member of a group of proteins called nuclear hormone receptors. The name comes from the fact that other hormones (progesterone, androgen, thyroid, glucocorticoids, mineralocorticoids) have their own proteins that turn on (or turn off) genes the same way. Subsequently it was found that some vitamin metabolites (vitamin D3, vitamin A) have similar receptors even though they aren't hormones. The human genome contains 48 such proteins. Fewer than half of them have known ligands. Those with known ligands have their finger in just about every metabolic pie in the cell.

One final point. It has been estimated that 8-oxoguanine is formed 100,000 times each day in every cell. Perhaps its formation is physiologic rather than pathologic. Where does that leave antioxidant therapy, which has been touted to do everything but cure hemorrhoids? Well, one such trial was done on 29,000 Finnish men at high risk for lung cancer (they were smokers) [New England J. Med. vol. 330 pp. 1029-1035 (1994)]. Alpha tocopherol (one antioxidant used in the study) didn't decrease the incidence of lung cancer, and there was an 18% higher incidence of lung cancer among the men receiving beta carotene (another antioxidant). In medicine, theory is great but data trumps it every time.

Retread

February 14, 2008

Chemiotics: Introduction and allegro

[Editor’s note – a new guest contributor, Retread, has joined the team, and should be familiar to some of our readers...]

Feb 13, 2008

"Everything in Chemistry turns blue or explodes". Only a philosophy major in full hubristic cry could say that to his pre-med chemistry-major ex-roommate. There was some reality to it, as the teacher of Chem 101, Dr Hubert N. Alyea, was really a small boy trapped in a professor suit, and usually blew something up in every lecture. Chemistry is still the Rodney Dangerfield of the sciences, important when the demand for taxol for breast cancer threatened to destroy every yew tree in sight, yet largely ignored by the press and by its progeny, biochemistry and molecular biology.

Probably every nascent chemist suffers through things like this, but I got more than most, rooming with two philosophy majors as an undergraduate (one of whom was later a Rhodes). It definitely gives you both a thick skin and a more abstract cast of mind.

For who I am and my background, go to ChemBark, scroll down the Categories section until you get to Rip Van Winkle, open it up and start reading. This is where I would have stayed, happily posting now and then and reading and responding to comments. However, Paul has other fish to fry (probably his thesis) and ChemBark has developed a definite funereal cast in the 3 months since Paul's last post. Anyway, Paul got me started and gave me a forum, encouragement and advice, so I owe him at least a good dinner. Thanks Paul.

If contact with budding philosophers didn't make me somewhat reflective, then following the development of molecular biology from '62 to the present with the eye and background of a Woodward grad student and medical practice as a neurologist from '67 to '00 certainly was enough to do so. This is why future posts will be on things like:

1. Is there really such a thing as causality in cellular biochemistry and physiology?

2. Is organic chemistry easy or hard?
2a. If it's hard, is math harder?

3. Are there important chemical experiments which we can't do because the earth isn't big enough?

4. Is there really such a thing as control in chemical systems with feedback on every component (including the elements providing the feedback)?

5. Does the complexity of cellular chemistry and biochemistry raise questions about the adequacy of chance to bring it about?

That's for the future. The next post (probably a very long one because of the background required) will be on a recent spectacular paper which, if replicated and generally applicable, will revolutionize the way we think of the control of gene transcription. Thomas Kuhn, where are you when we need you?

Stay tuned

Retread
