THE DISCWORLD VERSION OF DARWIN'S vision may not be quite what Roundworld's historians of science like to tell us, but the two will have been done converged on to the same timeline if the wizards manage to have will defeated the Auditors, so we can concentrate on the after-effects of that convergence. In any case some features are common to both versions of Darwinian history, including apes, beetles, and parasitic wasps. By contemplating these organisms, and many others - especially those confounded barnacles, of course - Darwin was led to his grand synthesis.
Today, no area of biology remains unaffected by the discovery of evolution. The evidence that today's species evolved from different ones, and that this process still continues, is overwhelming. Very little modern biology would make sense without the over-arching framework of evolution. If Darwin were reincarnated today, he would recognise many of his ideas, perhaps slightly reformulated, in the conventional scientific wisdom. The big principle of natural selection would be one of them. But he would also observe debate, perhaps even controversy, about this fundamental pillar of his thinking. Not whether natural selection happens, not whether it drives much of evolution; but whether it is the only driving force.
He would also find many new layers of detail filling some of the gaps in his theories. The most important and far-reaching of these is DNA, the magic molecule that carries genetic `information', the physical form of heredity. Darwin was sure that organisms could pass on their characteristics to their offspring, but he had no idea how this process was implemented, and what physical form it took. Today we are so familiar with the role of genes, and their chemical structure, that any discussion of evolution is likely to focus mainly on DNA chemistry. The role of natural selection, indeed the role of organisms, has been downgraded: the molecule has triumphed.
We want to convince you that it won't stay that way.
Evolution by natural selection, the great advance that Darwin and Wallace brought to public attention, is nowadays considered to be `obvious' by scientists of most persuasions and by most nonspecialists outside the US Bible Belt. This consensus has arisen partly because of a general perception that biology is `easy', it isn't a real, hard-to-understand science like chemistry or physics, and most people think that they know enough about it by a kind of osmosis from the general folk information. This assumption showed up amusingly at the Cheltenham Science Festival in 2001, when the Astronomer Royal Sir Martin Rees and two other eminent astronomers gave talks on `Life Out There'.
The talks were sensible and interesting, but they made no contact with real modern biology. They were based on the kind of biology that is currently taught in schools, most of which is about thirty years out of date. Like almost everything in school science, because it takes at least that long for ideas to `trickle down' from the research frontiers to the classroom. Most `modern mathematics' is at least 150 years old, so thirty-year-old biology is pretty good. But it's not what you should base your thinking on when discussing cutting-edge science.
Jack, in the audience, asked: `What would you think of three biologists discussing the physics of the black hole at the centre of the galaxy?' The audience applauded, seeing the point, but it took a couple of minutes for the scientists on the platform to understand the symmetry. They were then as contrite as they could be without losing their dignity.
This kind of thing happens a lot, because we are all so familiar with evolution that we think we understand it. We devote the rest of this section to a reasonable account of what the average person thinks about evolution. It goes like this.
Once upon a time there was a little warm pond full of chemicals, and they messed about a bit and came up with an amoeba. The amoeba's progeny multiplied (because it was a good amoeba) and some of them had more babies (something funny here ... ) and some had fewer, and some of them invented sex and had a much better time after that. Because biological copying wasn't very good in those days, all of their progeny were different from each other, carrying various copying mistakes called mutations.
Nearly all mutations were bad, on the principle that putting a bullet randomly through a piece of complex machinery is unlikely to improve its performance, but a few were good. Animals with good mutations had many more babies, and those had the good mutation too, so they thrived and bred. Their progeny carried the good mutation into the future. However, many more bad mutations accumulated, so natural selection killed those off. Luckily, another new mutation appeared, which made a new character for a new species (better eyes, or swimming fins, or scales), which was altogether better and took over.
These later species were fishes, and one of them came out on land, growing legs and lungs to do so. From these first amphibians arose the reptiles, especially the dinosaurs (while the unadventurous fishes were presumably just messing about in the sea for millions of years, waiting to be fish and chips). There were some small, obscure mammals, who survived by coming out at night and eating dinosaur eggs. When the dinosaurs died, the mammals took over the planet, and some evolved into monkeys, then apes, then Stone Age people.
Then evolution stopped, with amoebas in ponds content to remain amoebas and not wanting to be fishes, fishes not wanting to be dinosaurs but just living their little fishy lives, the dinosaurs wiped out by a meteorite. The monkeys and apes, having seen what it was like to be at the peak of evolution, are now just slowly dying out - except in zoos, where they are kept to show us what our progenitors used to be like. Humans now occupy the top branch of the tree of life: since we are perfect, there's nowhere for evolution to go any more, which is why it has stopped.
If pressed for more detail, we dredge up various things we've learned, mostly from newspapers, about things called genes. Genes are made from a molecule called DNA, which takes the form of a double helix and contains a kind of code. The code specifies how to make that kind of organism, so human DNA contains the information needed to make a human, whereas cat DNA contains the information for a cat, and so on. Because the DNA helix is double, it can be split apart, and the separate parts can easily be copied, which is how living creatures reproduce. DNA is the molecule of life, and without it, life would not exist. Mutations are mistakes in the DNA copying process - typos in the messages of life.
Your genes specify everything about you - whether you'll be homosexual or heterosexual, what kinds of diseases you will be susceptible to, how long you will live ... even what make of car you will prefer. Now that science has sequenced the human genome, the DNA sequence for a person, we know all of the information required to make a human, so we know everything there is to know about how human beings work.
Some of us will be able to add that most DNA isn't in the form of genes, but is just `junk' left over from some distant part of our evolutionary history. The junk gets a free ride on the reproductive roller-coaster, and it survives because it is `selfish' and doesn't care what happens to anything except itself.
Here ends the folk view of evolution. We've parodied it a little, but not by as much as you might hope. The first part is a lie-to-children about natural selection; the second part is uncomfortably close to `neo-Darwinism', which for most of the past 50 years has been the accepted intellectual heir to The Origin. Darwin told us what happens in evolution; neo-Darwinism tells us how it happens, and how it happens is DNA.
There's no question that DNA is central to life on Earth. But virtually every month, new discoveries are being made that profoundly change our view of evolution, genetics, and the growth and diversification of living creatures. This is a vast topic, and the best we can do here is to show you a few significant discoveries and explain why they are significant.
Just as physics replaced Newton by Einstein, there has been a major revolution in the basic tenets of biology, so we now have a different, more universal view of what drives evolution. The `folk' evolutionary viewpoint: `I've got this new mutation. I have become a new kind of creature. Is it going to do me any good?' is not the way modern biologists think.
There are many things wrong with our folk-evolution story. In fact we've deliberately constructed it so that every single detail is wrong. However, it's not very different from many accounts in popular science books and television programmes. It assumes that primitive animals alive today are our ancestors, when they are our cousins. It assumes that we `came from' apes, when of course the ape-like ancestor of man is the same creature as the man-like ancestor of modern apes. More seriously, it assumes that mutations in the genetic material, the changes that natural selection has to work on - indeed, to select among - are checked out as soon as they appear, and labelled `bad' (the organism dies, or at least fails to breed) or `good' (the animal contributes its progeny to the future).
Until the early 1960s, that was what most biologists thought too. Indeed, two very famous biologists, J.B.S. Haldane and Sir Ronald Fisher, produced important papers in the mid-1950s espousing just that view. In a population of about 1000 organisms, they believed, only about a third of the breeding population could be `lost' to bad gene variants, or could be ousted by organisms carrying better versions, without the population moving towards extinction. They calculated that only about ten genes could have variants (known as `alleles') that were increasing or decreasing as proportions of the population. Perhaps twenty genes might be changing in this way if they were not very different in `fitness' from the regular alleles. This picture of the population implied that almost all organisms in a given species must have pretty much the same genetic make-up, except for a few which carried the good alleles coming in, and winning, or the bad alleles on the way out[48]. These exceptions were mutants, famously and stupidly portrayed in many SF films.
However, in the early 1960s Richard Lewontin's group exploited a new way to investigate the genetics of wild (or indeed any) organisms. They looked at how many versions of common proteins they could find in the blood, or in cell extracts. If there was just one version, the organism had received the same allele from both of its parents: the technical term here is `homozygous'. If there were two versions, it had received different ones from each parent, and so was `heterozygous'.
What they found was totally incompatible with the Fisher-Haldane picture.
They found, and this has been amply confirmed in thousands of wild populations since, that in most organisms, about ten per cent of genes are heterozygous. We now know, thanks to the Human Genome Project, that human beings have about 34,000 genes. So about 3400 are heterozygous, in any individual, instead of the ten or so predicted by Haldane and Fisher.
Furthermore, if many different organisms are sampled, it turns out that about one-third of all genes have variant alleles. Some are rare, but many of them occur in more than one per cent of the population.
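The size of the mismatch is easy to check for yourself. The following is only a back-of-envelope sketch in Python: every number in it is quoted from the text above, and the variable names are ours, invented for the illustration.

```python
# Back-of-envelope check of the figures quoted in the text.
# All constants come from the surrounding paragraphs; none is measured data.
HUMAN_GENES = 34_000            # gene count quoted in the text
HETEROZYGOUS_FRACTION = 0.10    # "about ten per cent of genes are heterozygous"
VARIANT_FRACTION = 1 / 3        # "about one-third of all genes have variant alleles"
CLASSICAL_PREDICTION = 10       # "the ten or so predicted by Haldane and Fisher"

# Genes that differ between the two parental copies in one individual.
heterozygous_genes = round(HUMAN_GENES * HETEROZYGOUS_FRACTION)

# Genes with variant alleles somewhere in the population.
genes_with_variants = round(HUMAN_GENES * VARIANT_FRACTION)

# How far the observation overshoots the classical expectation.
discrepancy = heterozygous_genes / CLASSICAL_PREDICTION

print(f"Heterozygous genes per individual: about {heterozygous_genes}")
print(f"Genes with variant alleles in the population: about {genes_with_variants}")
print(f"Observed / classical prediction: roughly {discrepancy:.0f}-fold")
```

The point of the exercise is the last line: the observation is not a little larger than the classical prediction, it is larger by a factor of hundreds.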
There is no way that this real-world picture of the genetic structure of populations can be reconciled with the classical view of population genetics. Nearly all current natural selection must be discriminating between different combinations of ancient mutations. It's not a matter of a new mutation arriving and the result being immediately subjected to selection: instead, that mutation must typically hang around, for millions of years, until eventually it ends up playing a role that makes enough of a difference for natural selection to notice, and react.
With hindsight, it is now obvious that all currently existing breeds of dog must have been `available' - in the sense that the necessary alleles already existed, somewhere in the population - in the original domesticated wolves. There simply hasn't been time to accumulate the necessary mutations purely in modern dogs. Darwin knew about the amount of cryptic and overt variation in pigeons, too. But his successors, hot on the trail of the molecular basis of life, forgot about wolves and pigeons. They pretty much forgot about cells. DNA was complicated enough: cell biology was impossible, and as for understanding an organism ...
Lewontin's discovery was a significant turning point in our understanding of heredity and evolution. It was at least as radical as the much better publicised revolution that replaced Newton's physics with Einstein's, and it was arguably more important. We will see that in the last year or so there has been another, even more radical, revision of our thinking about the control of cell biology and development by the genes. The whole dogma about DNA, messenger RNA, and proteins has been given a reality check, and science's internal `auditors' have rendered it as archaic as Fisher's population genetics.
It is commonly assumed - not only by the average television producer of pop science half-hours, but also by most popular science book authors - that now we know about DNA, the `secret of life', evolution and its mechanisms are an open book. Soon after the discovery of DNA's structure and mechanism of replication by James Watson and Francis Crick, in the late 1950s, the media - and biology textbooks at all levels - were beginning to refer to it as the `Blueprint for Life'. Many books, culminating with Dawkins's The Selfish Gene in the 1970s, promoted the view that by knowing about the mechanism of heredity, we had found the key to all of the important puzzles of biology and medicine, especially evolution.
There was soon to be a major tragedy, resulting from a medical application of that mistaken view. The sedative thalidomide was increasingly being prescribed, and bought over the counter, to treat nausea and other minor discomforts of the early weeks of pregnancy.
Only later was it discovered that in a small proportion of cases, thalidomide could cause a type of birth defect known as phocomelia, in which arms and legs are replaced by partially developed versions that resemble a seal's flippers.
It took a while for anyone to notice, partly because few general practitioners had experience of phocomelia before 1957. In fact, very few of them had ever seen a case at all, but after 1957 they began to see two or three in a year. A second reason was that it was very difficult to tie this defect to a particular potion or treatment: pregnant women famously take a great variety of dietary additives, and often they don't remember precisely what they've taken. Nevertheless, by 1961 some medical detective work had tied the spate of phocomelia down to thalidomide.
American doctors congratulated themselves on having missed out on the pathology, because Frances Kelsey, a medical worker for the Food and Drug Administration, had expressed misgivings about the original animal testing of the drug. Her misgivings eventually turned out to have been unfounded, but they did save much suffering in the USA. She noticed that the drug had not been tested on pregnant animals, because at that time such tests were not required. Everyone knew that the embryo has its own blueprint for development, quite separate from that of the mother. However, embryologists trained in biology departments, as distinct from medical embryologists, knew about the work of Cecil Stockard, Edward Conklin, and other embryologists of the 1920s. They had shown that many common chemicals could cause monstrous developmental defects. For instance, lithium salts easily induced cyclopia, a single central eye, in fish embryos. These alternative developmental paths, induced by chemical changes, have taught us a lot about the biological development of organisms, and how it is controlled.
They have also taught us that an organism's development is not rigidly determined by the DNA of its cells. Environmental insults can push the course of development along pathological paths. In addition, the genetics of organisms, particularly wild organisms, are usually organised so that `normal' development happens despite a variety of environmental insults, and even despite changes in some of the genes. This so-called `canalised' development is very important for evolutionary processes, because there are always temperature variations, chemical imbalances and assaults, parasitic bacteria and viruses; the growing organism must be `buffered' against these variations. It must have versatile developmental paths to ensure that the `same' well-adapted creature is produced, whatever the environment is doing. Within reasonable limits, at any rate.
There are many developmental tactics and strategies that help to accomplish this. They range from simple tricks like the HSP90 protein to the very clever mammalian trade-off.
HSP stands for `heat shock protein'. There are about 30 of these proteins, and they are produced in most cells in response to a sudden, not very severe, change of temperature. A different array of proteins is produced in response to other shocks; this one is called HSP90 because of where it sits in a much longer list of cell proteins. HSP90, like most HSPs, is a chaperonin: its job is to hug other proteins during their construction, so that when the long line of amino acids folds up it achieves the `right' shape. HSP90 is very good at making the `right' shape - even if the gene that specifies the chaperoned protein has accumulated a lot of mutations. So the resulting organism doesn't `notice' the mutations; the protein is `normal' and the organism looks and behaves just like its ancestral form.
However, if there's a heat shock or other emergency during development, HSP90 is diverted from its role as chaperonin, and other less powerful chaperonins permit the mutational differences to be expressed in most of the progeny. The effect this has on evolution is to keep the organisms much the same until there's an environmental stress, when suddenly, in one generation, lots of previously hidden, but hereditable, variation appears.
Most books that describe evolution seem to assume that every time there's a mutation, the environment promptly gets to judge it good or bad ... but one little trick, HSP90, which is present in most animals and many bacteria, makes nonsense of that assertion. And from Lewontin's discovery that a third of genes have common variants in wild populations, and that all organisms carry lots of them, it is clear that ancient mutations are continually being tested in different modern combinations, while the potential effects of more recent mutations are being cloaked by HSP90 and its ilk.
The trick employed by mammals is much more complex and far-reaching. They reorganised their genes, and got rid of a lot of genetic complication that their amphibian ancestors relied on, by adopting a new and more controlled developmental strategy. Most frogs and fishes, whose eggs usually encounter great differences and changes of temperature during each embryology, ensure that the `same' larva, and then adult, results. Think of frog spawn in a frozen English pond, warming up to 35°C during the day while the delicate early development proceeds; then the little hatchling tadpoles have to endure these temperature changes. Now think of the frogs that so few of the tadpoles become.
Most chemical reactions, including many biochemical ones, happen at different rates if the temperature is different. You only get a frog if all the different developmental processes fit together effectively, and timing is crucial. So how does frog development work at all, given that the environment is changing so quickly and repeatedly?
The answer is that the frog genome `contains' many different contingency plans, for many different environmental scenarios. There are many different versions of each of the enzymes and other proteins that frog development requires. All of them are put into the egg while it is in mother frog's ovary. There are perhaps as many as ten versions of each, appropriate to different temperatures (fast enzymes for low temperatures, sluggish ones for higher temperatures, to keep the duration of development much the same)[49], and they have `labels' on the packages that make them, so the embryo can choose which one to use according to its temperature. Animals whose development must be buffered in this way use a lot of their genetic programme to set up contingency plans for many other variables, in addition to temperature.
The mammals cleverly avoided all of this faffing around, by making their females thermostatically controlled - `warm-blooded'. What counts is not the warmth of the blood, but the system that maintains it at a constant temperature. The beautifully controlled uterus keeps all kinds of other variables away from the embryos, too, from poisons to predators. It probably `costs' much less in DNA programming to adopt this strategy, too.
This trick, evolved by the mammals, carries an important message. To ask how much information passes across the generations in the DNA blueprint, as textbooks and sophisticated research manuals often do, is to miss the point. How the genes and proteins are used is far more important, and far more interesting, than how many genes or proteins there are in a given creature. Lungfishes and some salamanders, even some amoebas, have more than fifty times as much DNA as we mammals do. What does this say about how complex these creatures are, compared to us?
Absolutely nothing.
Tricks like HSP90, and strategies like warm-bloodedness and keeping development inside the mother, mean that bean-counting of DNA `information' is beside the point. What counts is what the DNA means, not how big it is. And meaning depends on context, as well as content: you can't regulate the temperature of a uterus unless your context (that is, mother) provides one.
The simple-minded `mutation' viewpoint, allied to trendy interpretations of DNA function in terms of `information theory', is often allied with ignorance of biology in other areas. One example is radiation biology and simple ecology as seen by `conservation activists'. Some of these volunteers found five-legged frogs and other 'monsters' downwind of the Chernobyl site, years after the nuclear accident but while radiation levels were still noticeably high. They claimed that the monsters were mutants, caused by the radiation. Other workers, however, then found just as many supposed mutants upwind of the reactor site.
It turned out that the best explanation had nothing to do with mutant frogs. It was the absence of their usual predators, owls and hawks and snakes, because there were so many humans trudging about. Rana palustris tadpoles from Chernobyl produced no more of these pathologies than did other frogspawn samples from ponds some tens of kilometres away that had not been subjected to radiation, when a high percentage of both was allowed to survive. Usually, in British Rana temporaria frogs, it is very difficult to achieve ten per cent normal adults, or even ones that are viable in the laboratory, but they don't produce extra limbs as palustris does. It is normally the case, of course, that a female frog's lifetime production of some 10,000 eggs results in a few highly selected, and therefore `normal', survivors, and on average just two breeders. But conservationists don't like thinking about this reproductive arithmetic, with all those deaths.
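The reproductive arithmetic that conservationists prefer not to think about takes only a few lines. This is a sketch using the two round figures from the paragraph above, some 10,000 eggs and two breeders; the variable names are ours, and reading `two breeders' as replacements for the mother and her mate is our interpretation.

```python
# Reproductive arithmetic for one female frog, using the round figures in the text.
EGGS_PER_FEMALE = 10_000     # "lifetime production of some 10,000 eggs"
BREEDERS_PER_FEMALE = 2      # "on average just two breeders"
PARENTS_PER_CLUTCH = 2       # our reading: the mother and her mate

# Fraction of eggs that ever go on to breed.
survival_fraction = BREEDERS_PER_FEMALE / EGGS_PER_FEMALE

# Eggs, tadpoles and froglets that die before breeding.
deaths_per_female = EGGS_PER_FEMALE - BREEDERS_PER_FEMALE

# A ratio of 1.0 means the parents are exactly replaced: a stable population.
replacement_ratio = BREEDERS_PER_FEMALE / PARENTS_PER_CLUTCH

print(f"Fraction of eggs that become breeders: {survival_fraction:.2%}")
print(f"Deaths per female's lifetime output: {deaths_per_female}")
print(f"Breeders per parent: {replacement_ratio}")
```

With losses on that scale, whatever decides which few tadpoles survive matters far more to what the adults look like than any rare new mutation.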
Here is another issue, again chosen from the thalidomide literature, that demonstrates how talk of Lamarckism, or of `mutations', misses the point.
Some of the children affected by thalidomide have married each other, and several of these pairings have produced phocomelic children. The obvious deduction, from the folk-DNA point of view, is that the DNA of the first generation must have been altered, so that it produced the same effect in the next generation. In fact, this effect looks, at first glance, like Lamarckism: the inheritance of acquired characters. Indeed, it seems a classic demonstration of such inheritance, as convincing as if cutting off terriers' tails resulted in puppies being born with short tails. However, it is actually a lesson in not attempting to explain things `at first glance', like the conservationists did with the abnormal frogs.
It is very tempting to do just that, when the idea of heredity in your mind is that one gene leads to one character, so if you've got the character you've got the gene, and vice versa. Figures from the epidemiological literature suggest that in the space of a few years either side of 1960, about 4 million women took thalidomide at the critical time during gestation. Of those, about 15,000-18,000 foetuses were damaged; 12,000 came to birth with defects, and about 8,000 survived their first year. That is to say, the natural course of development selected just 1 in 500 who showed adverse effects. The proportion of children born with no detectable defect was much, much higher. And that fact changes our view of the likely reason for the children of two thalidomide parents to suffer from phocomelia, for the following reason.
Conrad Waddington demonstrated a phenomenon called `genetic assimilation'. He started with a genetically diverse population of wild fruit flies, and found that about one in 15,000 of their pupae, when warmed, produced a fly with no cross-vein in its wing. These `cross-veinless' flies looked just like some very rare mutant flies that turned up occasionally in the wild, just as occasional genetically phocomelic children turned up before thalidomide. By breeding from the flies that responded to the treatment, Waddington selected for a lower and lower threshold of response. In a few tens of generations, he had selected flies that bred true for the cross-veinless trait, exhibiting it regularly without anyone warming the pupae. This may look like Lamarckian inheritance, but it's not. It's genetic assimilation. The experiments were selecting flies that had no cross-vein at lower and lower temperature thresholds. Eventually, they selected flies that had no cross-vein at `normal' temperatures.
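The logic of genetic assimilation can be imitated with a toy simulation. What follows is a minimal sketch in Python, not a reconstruction of Waddington's experiment. Everything in it is an assumption made for illustration: each fly is reduced to a single inherited `liability' number, warming the pupae adds a fixed amount to it, the trait appears when the total crosses a threshold, and offspring inherit the average of two responding parents plus some random scatter. The function name and all parameter values are hypothetical, and responders are made far more common than one in 15,000 so that a small population will do.

```python
import random


def simulate_assimilation(generations=20, pop_size=2000, threshold=3.0,
                          heat_shift=1.5, segregation_sd=0.7, seed=1):
    """Toy threshold model of genetic assimilation (all parameters hypothetical).

    Returns, for each generation, the fraction of flies that would show the
    trait WITHOUT any heat treatment.
    """
    rng = random.Random(seed)
    # Starting population: inherited liabilities scattered around zero.
    flies = [rng.gauss(0.0, 1.0) for _ in range(pop_size)]
    history = []

    for generation in range(generations + 1):
        # How many flies show the trait at 'normal' temperature?
        unheated = sum(liability > threshold for liability in flies) / pop_size
        history.append(unheated)
        if generation == generations:
            break

        # Warm the pupae: the heat pushes some flies over the threshold.
        responders = [l for l in flies if l + heat_shift > threshold]
        if len(responders) < 2:
            break  # nothing to breed from

        # Breed only from responders. No liability is altered by the heat:
        # offspring get the parental average plus random scatter.
        next_generation = []
        for _ in range(pop_size):
            mother, father = rng.sample(responders, 2)
            child = (mother + father) / 2 + rng.gauss(0.0, segregation_sd)
            next_generation.append(child)
        flies = next_generation

    return history


history = simulate_assimilation()
for generation, fraction in enumerate(history):
    print(f"generation {generation:2d}: {fraction:.1%} show the trait without heat")
```

Notice what the model leaves out: the heat never changes any fly's inherited liability. The trait spreads to unheated flies purely because breeding from responders keeps picking out the flies whose existing genetic variation already put them closest to the threshold. Nothing acquired is inherited.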
Similarly, genetic assimilation provides a much better explanation than Lamarckism for the phocomelic children of thalidomide-modified parents. We have selected, from some 4 million foetuses, those that respond to thalidomide with phocomelia. It is not surprising that when they marry each other, they produce a few progeny whose threshold is very low - below zero in fact. They are so liable to produce phocomelia that they do it without thalidomide, just as Waddington's flies came to produce cross-veinlessness without warming the pupae.
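How severe was that selection? The epidemiological figures quoted earlier can be turned into proportions with a short calculation. This is a sketch only: the numbers are the approximate ones given in the text, and the variable names are ours.

```python
# Proportions implied by the thalidomide figures quoted in the text.
WOMEN_EXPOSED = 4_000_000              # took the drug at the critical time
DAMAGED_LOW, DAMAGED_HIGH = 15_000, 18_000   # foetuses damaged (range)
BORN_WITH_DEFECTS = 12_000
SURVIVED_FIRST_YEAR = 8_000

# The "1 in 500" of the text: affected survivors as a share of those exposed.
one_in = WOMEN_EXPOSED / SURVIVED_FIRST_YEAR

# Share of exposed pregnancies in which the foetus was damaged at all.
damaged_fraction_low = DAMAGED_LOW / WOMEN_EXPOSED
damaged_fraction_high = DAMAGED_HIGH / WOMEN_EXPOSED

# Share with no recorded damage, taking the higher damage estimate.
undamaged_fraction = 1 - damaged_fraction_high

print(f"Affected survivors: 1 in {one_in:.0f} of those exposed")
print(f"Foetuses damaged: {damaged_fraction_low:.3%} to {damaged_fraction_high:.3%}")
print(f"No recorded damage: at least {undamaged_fraction:.2%}")
```

On these figures well over 99 per cent of exposed foetuses showed no recorded damage, so the affected children are a very strongly selected sample, much like the rare flies that responded to warming.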
One of the things that really worried Darwin was the existence of parasitic wasps - a fact that has influenced our Discworld tale, but has gone unremarked until now in the scientific commentaries. Parasitic wasps lay their eggs in other insects' larvae, so that as the wasp eggs grow into wasp larvae, they eat their hosts. Darwin could see how this might have happened on evolutionary grounds, but it seemed to him to be rather immoral. He was aware that wasps don't have a sense of morality, but he saw it as some kind of flaw on the part of the wasps' creator. If God designed each species on Earth, for a special purpose - which is what most people believed at the time - then God had deliberately designed parasitic wasps, whose purpose was to eat other species of insect, also designed by God. To be so eaten, presumably.
Darwin was fascinated by such wasps, ever since he first encountered them in Botafogo Bay, Brazil. He eventually satisfied himself - though not his successors - that God had found it necessary to permit the existence and evolution of parasitic wasps in order to get to humans. This is what the quote at the end of Chapter 10 alludes to. That particular explanation has fallen out of favour among biologists, along with all theist interpretations. Parasitic wasps exist because there is something for them to parasitise - so why not? Indeed, parasitic wasps play a major role in controlling many other insect populations: nearly one-third of all of the insect populations that humans like to label `pests' are kept at bay in this manner. Maybe they were created in order for humans to be possible ... At any rate, the wasps that so puzzled Darwin still have much to tell us, and the latest discovery about them threatens to overturn several cherished beliefs.
Strictly, the discovery is not so much about the wasps, as about some viruses that infect them ... or are symbiotic with them. They are called polydnaviruses.
When mother wasp injects her eggs into some unsuspecting larva, such as a caterpillar, she also injects a solid dose of viruses, among them said polydnaviruses. The caterpillar not only gets a parasite, it gets an infection. The virus's genes produce proteins that interfere with the caterpillar's own immune system, stopping it reacting to the parasite and, perhaps, rejecting it. So the wasp larvae munch merrily away on the caterpillar, and in the fullness of time they develop into adult wasps.
Now, any self-respecting adult parasitic wasp obviously needs its own complement of polydnaviruses. Where does it get them? From the caterpillar that it fed on. And it gets them (just as mother did) not as a separate infective `organism', but as what is called a provirus: a DNA sequence that has been integrated into the wasp's own genome.
Many genomes, probably most if not all, include various bits of viruses in this way. Our own certainly does. Transport of DNA by viruses seems to have been an important feature of evolution.
In 2004 a team headed by Eric Espagne worked out the DNA sequence of a polydnavirus - as one does - and what they found was dramatically different from what anyone had expected. Typical virus genomes are very different from those of `eukaryotes' - organisms whose cells have a nucleus, which includes most multi-cellular creatures and many single-celled ones, but not bacteria. The DNA sequences of most eukaryote genes consist of `exons', short sequences that collectively code for proteins, separated by other sequences called introns, which get snipped out when the code is turned into the appropriate protein. Viral genes are relatively simple, and typically they do not contain introns. They consist of connected code sequences that specify proteins. This particular polydnavirus genome, in contrast, does contain introns, quite a lot of them. The genome is complex, and looks much more like a eukaryote genome than a virus genome. The authors conclude that polydnavirus genomes constitute `biological weapons directed by the wasps against their hosts'. So they look more like the enemy's genome than that of an ordinary virus.
Numerous examples, old and new, disprove every aspect of the folk version of evolution and DNA. We end with one that looks especially important, discovered very recently, and whose significance is just becoming seriously apparent to the biological community. It is probably the most severe shock that cell biology has received since the discovery of DNA and the wonderful `central dogma': DNA specifies messenger-RNA which specifies proteins. The discovery was not made through some big, highly publicised research programme like the human genome project. It was made by someone who wondered why his petunias had gone stripy. When all the world is chasing `the' human genome, it's not easy to get research grants to work on stripy petunias. But what the petunias revealed is probably going to be far more important for medicine than the entire human genome project.
Because proteins are the structure of living creatures, and because as enzymes they control the processes of life, it has seemed obvious that DNA controls life, that we can `map' DNA code on to all the important living functions. We could assign a function to each protein, so we could assume that the DNA that coded for that protein was ultimately or fundamentally responsible for the corresponding function. Dawkins's early books reinforced the idea of one gene, one protein, one function (although he carefully warned his readers that he didn't want to give that impression), and this encouraged such media exaggerations as calling the human genome the Book of Life. And the `selfish gene' image made it entirely credible that huge stretches of the genome were present for solely selfish reasons - that is, for no reason related to the organism concerned.
Biologists employed - as so many now are - in the biotechnology industries serving agriculture, pharmacy, medicine, even some engineering projects (we don't mean just `genetic engineering' but making better motor oils), all subscribe to the central dogma, with a few minor modifications and exceptions. All of them have been informed that nearly all of the DNA in the human genome is `junk', not coding for proteins, and that although some of it may be important for developmental processes or for controlling some of the `real' genes, they really don't need to worry about it.
Admittedly, quite a lot of junk DNA seems to be transcribed into RNA, but these are just short lengths that sit about briefly in the cell fluids and don't need to be considered when you're doing important protein-making things with the real genes. Recall that the DNA sequences of real genes consist of a mosaic of `exons' which code for proteins, separated by other sequences called introns. The introns have to be cut out of the RNA copies to get the `real' protein-coding sequences, called messenger-RNAs, which lace into ribosomes like tapes into a tape player. Messenger-RNAs determine what proteins get made, and they have sequences on their ends that label them for making many copies of a protein or for destruction after only a couple of protein molecules.
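For readers who like to see such things spelled out, the snipping can be mimicked in a few lines of Python. This is only a toy sketch under heavy assumptions: the sequences are invented, the function names transcribe and splice are our own, and a real cell finds its introns with elaborate molecular machinery rather than by reading convenient labels.

```python
# Toy model of splicing. A gene is a list of (kind, sequence) segments.
# The sequences are made up for illustration; they are not real genes.

def transcribe(dna: str) -> str:
    """Copy DNA letters into RNA letters (T becomes U), a deliberate simplification."""
    return dna.replace("T", "U")

def splice(gene):
    """Return (messenger_rna, snipped_introns) for a list of labelled segments."""
    exons = [transcribe(seq) for kind, seq in gene if kind == "exon"]
    introns = [transcribe(seq) for kind, seq in gene if kind == "intron"]
    return "".join(exons), introns

toy_gene = [
    ("exon", "ATGGCT"),
    ("intron", "GTAAGTTTCAG"),
    ("exon", "GATTGA"),
]

mrna, leftovers = splice(toy_gene)
print(mrna)       # AUGGCUGAUUGA
print(leftovers)  # ['GUAAGUUUCAG']
```

The point of returning the leftovers as well as the messenger is that the snipped-out pieces do not simply vanish: they are RNA molecules in their own right, loose in the cell.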
Nobody worried much about those snipped-out introns, just bits of RNA drifting aimlessly around in the cell till they got broken up by enzymes. Now, they do worry. Writing in the October 2004 Scientific American, John Mattick reports that `The central dogma is woefully incomplete for describing the molecular biology of eukaryotes. Proteins do play a role in the regulation of eukaryotic gene expression, yet a hidden, parallel regulatory system consisting of RNA that acts directly on DNA, RNAs and proteins is also at work. This overlooked RNA signalling network may be what allows humans, for example, to achieve structural complexity far beyond anything seen in the unicellular world.'
Petunias made that clear. In 1990 Richard Jorgensen and colleagues were trying to breed new varieties of petunias, with more interesting, brighter colours. An obvious approach was to engineer into the petunia genome some extra copies of the gene that coded for an enzyme involved in the production of pigment. More enzyme, more pigment, right?
Wrong.
Less pigment?
No, not exactly. What previously was a uniformly coloured petal became stripy. In some places the pigment was being produced, elsewhere it wasn't. This effect was so surprising that plant biologists tried to find out exactly why it was happening. And what they found was `RNA interference'. Certain RNA sequences can shut down a gene, prevent it making protein. It happens in many other organisms, too. In fact, it is extremely widespread. And it suggests something extraordinarily important.
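The idea that a short RNA sequence can shut down a gene can also be caricatured in code. The sketch below is a cartoon, not the real mechanism: it assumes that silencing happens whenever a small RNA can pair, base for base, with some stretch of a messenger-RNA, and every sequence and name in it (is_silenced, pigment_mrna and so on) is invented for illustration.

```python
# Toy model of RNA interference. Sequences and names are invented.

PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna: str) -> str:
    """Return the RNA strand that would pair with the given one."""
    return "".join(PAIR[base] for base in reversed(rna))

def is_silenced(mrna: str, small_rnas) -> bool:
    """True if any small RNA can pair with some stretch of the messenger."""
    return any(reverse_complement(s) in mrna for s in small_rnas)

pigment_mrna = "AUGGCUGAUUGA"
interfering = ["AUCAGC"]   # pairs with the stretch GCUGAU in the messenger
unrelated = ["GGGGGG"]     # would need CCCCCC, which the messenger lacks

print(is_silenced(pigment_mrna, interfering))  # True
print(is_silenced(pigment_mrna, unrelated))    # False
```

In this cartoon, a cell that happens to contain the interfering sequence makes no pigment, and a neighbouring cell without it carries on as normal, which is one way to picture a stripy petal.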
The big question in this area, asked many times and largely ignored, has always been: if introns (which occupy all but one-twentieth of a typical protein-coding gene) have no biological function, why are they there? It is easy to dismiss them as relics of some dim evolutionary past, no longer useful, lying around because natural selection can't get rid of things that are harmless. Even so, we can still wonder whether introns are present because they do have some useful function, one that we haven't yet worked out. And it's starting to look as if that may be the case.
For a start, introns are not that ancient. It now seems that they became incorporated into the human genome relatively recently. They are probably related to mobile genetic elements known as group II introns, which are a `parasitic' form of DNA that can invade host genomes and then remove themselves when the DNA is expressed as RNA. Moreover, they now seem to have a role as `signals' in the regulation of genetic processes. An intron may be relatively short, compared to the long protein-coding sequences that arise when the introns are snipped out, but a short signal has advantages and can do quite a lot. In effect, the introns may be genetic `txt msgs' in the mobile phone of life. Short, cheap, and very effective. An RNA-based `code', running parallel to the DNA double helix, can affect the activity of the cell very directly. An RNA sequence can act as a very specific, well-defined signal, directing RNA molecules to their targets in RNA or DNA.
The evidence for the existence of such a signalling system is reasonable, but not yet undeniable. If such a system does exist, it clearly has the potential to resolve many biological mysteries. A big puzzle about the human genome is that its 34,000 genes manage to encode over 100,000 proteins. Clearly `one gene one protein' doesn't work. A hidden RNA signalling system could make one gene produce several proteins, depending on what the accompanying RNA signal specified. Another puzzle is the complexity of eukaryotes, especially the Cambrian explosion of 525 million years ago, when the range of terrestrial body-plans suddenly diversified out of all recognition; indeed, was more diverse than it is now. Perhaps the hypothetical RNA signalling system started to take off at that time. And it's widely known that the human and chimpanzee genomes are surprisingly similar (though the degree of similarity seems not to be 98 per cent as widely quoted even a few years ago). If our RNA signals are significantly different, that would be one way to explain why humans don't greatly resemble chimps.
At any rate, it very much looks as if all that `junk' DNA in your genome is not junk at all. On the contrary, it may be a crucial part of what makes you human.
This lesson is driven home by those business associates of parasitic wasps, the symbiotic polydnaviruses, sneakily buried inside the wasps' own DNA. There is a message there about human evolution, and it's a very strange one.
Genome-sequencing may have been oversold as the answer to human diseases, but it's very good basic science. The activities of the sequencers have revealed that wasps are not the only organisms to have bits of viral DNA hanging around in their genomes. In fact, most creatures do, humans included. The human genome even contains one complete viral genome, and only one, called ERV-3 (Endogenous RetroVirus). This may seem an evolutionary oddity, a bit of `junk DNA' that really is junk ... but, actually, without it none of us would be here. It plays the absolutely crucial role of preventing rejection of the foetus by the mother. Mother's immune system `ought to' recognise the tissue of a developing baby as `foreign', and trigger actions that will get rid of it. By `ought to' here we mean that this is what the immune system normally does for tissue that is not the mother's own.
Apparently, the ERV-3 protein closely resembles another one called p15E, which is part of a widespread defence system used by viruses to stop their hosts killing them off. The p15E protein stops lymphocytes, a key type of cell in the immune system, from responding to antigens, molecules that reveal the virus's foreign nature. At some stage during mammalian evolution, this defence system was stolen from the viruses and used to stop the female placenta responding to antigens that reveal the foreign nature of the foetus's father. Perhaps on the principle of being hung for a sheep as well as for a lamb, the human genome decided to go the whole hog[50] and steal the entire retroviral genome.
When evolution carried out the theft, however, it did not just dump its booty into the human DNA sequence unchanged. It threw in a couple of introns, too, splitting ERV-3 into several separate pieces. It's complete, but not connected. No matter: enzymes can easily snip out the introns when that bit of DNA is turned into protein. But no one knows why the introns are there. They might be an accidental intrusion. Or - pursuing the RNA interference idea - they might be much more significant. Those introns might be an important part of the genetic regulatory system, `text messages' that let the placenta use ERV-3 without running the risk of setting the corresponding virus loose.
At any rate, whatever the introns are for, warm-bloodedness is not the only trick that mammalian evolution managed to find and exploit. It also indulged in wholesale theft of a virus's genome, to stop mother's immune system booting out baby because it `smelled' of father. And we also get another lesson that DNA isn't selfish. ERV-3 is present in the human genome, but not because it's a bit of junk that gets copied along with everything else and has remained because it does no harm. It's there because, in a very real sense, humans could not survive - could not even reproduce - without it.