18. BIT FROM IT

A semaphore is a simple and time-honoured example of a digital communication system. It encodes letters of the alphabet using the positions of flags, lights, or something similar. In 1795 George Murray invented a version that is close to the system currently used in Discworld: a set of six shutters that could be opened or closed, thus giving 64 different 'codes', more than enough for the entire alphabet, numbers 0 to 10 and some 'special' codes. The system was further developed but ceased to be cutting-edge technology when the electric telegraph heralded the wired age. The Discworld semaphore (or 'clacks') has been taken much further, with mighty trunk route towers carrying bank after bank of shutters, aided by lamps after dark, and streaming messages bi-directionally across the continent. It is a pretty accurate 'evolution' of the technology: if we too had failed to harness steam and electricity, we might well be using something like it ...

There is enough capacity on that system even to handle pictures -seriously. Convert the picture to a 64 x 64 grid of little squares that can be black, white or four shades of grey, and then read the grid from left to right and top to bottom like a book. It's just a matter of information, a few clever clerks to work out some compression algorithms, and a man with a shallow box holding 4,096 wooden blocks, their six sides being, yes, black, white and four shades of grey. It'll take them a while to reassemble the pictures, but clerks are cheap.
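
For the curious, here is the back-of-the-envelope arithmetic as a toy Python sketch (ours, not the clerks'; the picture and the run-length scheme are invented purely for illustration): each block has six possible shades, so it carries a little under 2.6 bits, and long stretches of identical shades are exactly what a compression-minded clerk would exploit.

    import math

    SHADES = 6                       # black, white and four shades of grey
    SIDE = 64
    bits_per_block = math.log2(SHADES)           # about 2.58 bits per block
    raw_bits = SIDE * SIDE * bits_per_block      # roughly 10,600 bits per picture

    # A toy picture: mostly pale 'sky' (shade 0) with a darker band lower down.
    picture = [[0 if row < 48 else 4 for _ in range(SIDE)] for row in range(SIDE)]

    # Read the grid like a book and run-length encode it as (shade, run) pairs.
    flat = [shade for row in picture for shade in row]
    runs = []
    for shade in flat:
        if runs and runs[-1][0] == shade:
            runs[-1][1] += 1
        else:
            runs.append([shade, 1])

    print(f"uncompressed: about {raw_bits:.0f} bits; run-length pairs: {len(runs)}")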

Digital messages are the backbone of the Information Age, which is the name we currently give to the one we're living in, in the belief that we know a lot more than anyone else, ever. Discworld is comparably proud of being in the Semaphore Age, the Age of the Clacks. But what, exactly, is information?

When you send a message, you are normally expected to pay for it -because if you don't, then whoever is doing the work of transmitting that message for you will object. It is this feature of messages that has got Ridcully worried, since he is wedded to the idea that academics travel free.

Cost is one way to measure things, but it depends on complicated market forces. What, for example, if there's a sale on? The scientific concept of 'information' is a measure of how much message you're sending. In human affairs, it seems to be a fairly universal principle that for any given medium, longer messages cost more than short ones. At the back of the human mind, then, lurks a deep-seated belief that messages can be quantified: they have a size. The size of a message tells you 'how much information' it contains.

Is 'information' the same as 'story'? No. A story does convey information, but that's probably the least interesting thing about stories. Most information doesn't constitute a story. Think of a telephone directory: lots of information, strong cast, but a bit weak on narrative. What counts in a story is its meaning. And that's a very different concept from information.

We are proud that we live in the Information Age. We do, and that's the trouble. If we ever get to the Meaning Age, we'll finally understand where we went wrong.

Information is not a thing, but a concept. However, the human tendency to reify concepts into things has led many scientists to treat information as if it is genuinely real. And some physicists are starting to wonder whether the universe, too, might be made from information.

How did this viewpoint come about, and how sensible is it?

Humanity acquired the ability to quantify information in 1948, when the mathematician-turned-engineer Claude Shannon found a way to define how much information is contained in a message -he preferred the term signal -sent from a transmitter to a receiver using some kind of code. By a signal, Shannon meant a series of binary digits ('bits', 0 and 1) of the kind that is ubiquitous in modern computers and communication devices, and in Murray's semaphore. By a code, he meant a specific procedure that transforms an original signal into another one. The simplest code is the trivial 'leave it alone'; more sophisticated codes can be used to detect or even correct transmission errors. In the engineering applications, codes are a central issue, but for our purposes here we can ignore them and assume the message is sent 'in plain'.

Shannon's information measure puts a number to the extent to which our uncertainty about the bits that make up a signal is reduced by what we receive. In the simplest case, where the message is a string of 0s and 1s and every choice is equally likely, the amount of information in a message is entirely straightforward: it is the total number of binary digits. Each digit that we receive reduces our uncertainty about that particular digit (is it 0 or 1?) to certainty ('it's a 1', say) but tells us nothing about the others, so we have received one bit of information. Do this a thousand times and we have received a thousand bits of information. Easy.

The point of view here is that of a communications engineer, and the unstated assumption is that we are interested in the bit-by-bit content of the signal, not in its meaning. So the message 111111111111111 contains 15 bits of information, and so does the message 111001101101011.

But Shannon's concept of information is not the only possible one. More recently, Gregory Chaitin has pointed out that you can quantify the extent to which a signal contains patterns. The way to do this is to focus not on the size of the message, but on the size of a computer program, or algorithm, that can generate it. For instance, the first of the above messages can be created by the algorithm 'every digit is a 1'. But there is no simple way to describe the second message, other than to write it down bit by bit. So these two messages have the same Shannon information content, but from Chaitin's point of view the second contains far more 'algorithmic information' than the first.

Another way to say this is that Chaitin's concept focuses on the extent to which the message is 'compressible'. If a short program can generate a long message, then we can transmit the program instead of the message and save time and money. Such a program 'compresses' the message.

When your computer takes a big graphics file -a photograph, say -and turns it into a much smaller file in JPEG format, it has used a standard algorithm to compress the information in the original file. This is possible because photographs contain numerous patterns: lots of repetitions of blue pixels for the sky, for instance. The more incompressible a signal is, the more information in Chaitin's sense it contains. And the way to compress a signal is to describe the patterns that make it up. This implies that incompressible signals are random, have no pattern, yet contain the most information. In one way this is reasonable: when each successive bit is maximally unpredictable, you learn more from knowing what it is. If the signal reads 111111111111111 then there is no great surprise if the next bit turns out to be 1; but if the signal reads 111001101101011 (which we obtained by tossing a coin 15 times) then there is no obvious guess for the next bit.
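
Chaitin's measure cannot be computed exactly, as the next paragraph explains, but a general-purpose compressor gives a rough and entirely unofficial proxy for it. A minimal Python sketch, using toy signals of our own devising:

    import random
    import zlib

    patterned = "1" * 15_000                  # 'every digit is a 1', scaled up
    random.seed(1)
    coin_tosses = "".join(random.choice("01") for _ in range(15_000))

    for name, signal in (("patterned", patterned), ("coin tosses", coin_tosses)):
        squeezed = zlib.compress(signal.encode())
        print(f"{name}: {len(signal)} digits -> {len(squeezed)} bytes after compression")

The patterned signal squashes down to almost nothing; the coin tosses barely shrink at all.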

Both measures of information are useful in the design of electronic technology. Shannon information governs the time it takes to transmit a signal somewhere else; Chaitin information tells you whether there's a clever way to compress the signal first, and transmit something smaller. At least, it would do if you could calculate it, but one of the features of Chaitin's theory is that it is impossible to calculate the amount of algorithmic information in a message -and he can prove it. The wizards would approve of this twist.

'Information' is therefore a useful concept, but it is curious that 'To be or not to be' contains the same Shannon information as, and less Chaitin information than, 'xyQGRlfryu&d%skOwc'. The reason for this disparity is that information is not the same thing as meaning. That's fascinating.

What really matters to people is the meaning of a message, not its bit-count, but mathematicians have been unable to quantify meaning. So far.

And that brings us back to stories, which are messages that convey meaning. The moral is that we should not confuse a story with 'information'. The elves gave humanity stories, but they didn't give them any information. In fact, the stories people came up with included things like werewolves, which don't even exist on Roundworld. No information there -at least, apart from what it might tell you about the human imagination.

Most people, scientists in particular, are happiest with a concept when they can put a number to it. Anything else, they feel, is too vague to be useful. 'Information' is a number, so that comfortable feeling of precision slips in without anyone noticing that it might be spurious. Two sciences that have gone a long way down this slippery path are biology and physics.

The discovery of the linear molecular structure of DNA has given evolutionary biology a seductive metaphor for the complexity of organisms and how they evolve, namely: the genome of an organism represents the information that is required to construct it. The origin of this metaphor is Francis Crick and James Watson's epic discovery that an organism's DNA consists of 'code words' in the four molecular 'letters' A C T G, which, you'll recall, are the initials of the four possible 'bases'. This description led to the inevitable metaphor that the genome contains information about the corresponding organism. Indeed, the genome is widely described as 'containing the information needed to produce' an organism.

The easy target here is the word 'the'. There are innumerable reasons why a developing organism's DNA does not determine the organism. These non-genomic influences on development are collectively known as 'epigenetics', and they range from subtle chemical tagging of DNA to the investment of parental care. The hard target is 'information'. Certainly, the genome includes information in some sense: currently an enormous international effort is being devoted to listing that information for the human genome, and also for other organisms such as rice, yeast, and the nematode worm Caenorhabditis elegans. But notice how easily we slip into cavalier attitudes, for here the word 'information' refers to the human mind as receiver, not to the developing organism. The Human Genome Project informs us, not organisms.

This flawed metaphor leads to the equally flawed conclusion that the genome explains the complexity of an organism in terms of the amount of information in its DNA code. Humans are complicated because they have a long genome that carries a lot of information; nematodes are less complicated because their genome is shorter. However, this seductive idea can't be true. For example, the Shannon information content of the human genome is smaller by several orders of magnitude than the quantity of information needed to describe the wiring of the neurons in the human brain. How can we be more complex than the information that describes us? And some amoebas have much longer genomes than ours, which takes us down several pegs as well as casting even more doubt on DNA as information.

Underlying the widespread belief that DNA complexity explains organism complexity (even though it clearly doesn't) are two assumptions, two scientific stories that we tell ourselves. The first story is DNA as Blueprint, in which the genome is represented not just as an important source of control and guidance over biological development, but as the information needed to determine an organism. The second is DNA as Message, the 'Book of Life' metaphor.

Both stories oversimplify a beautifully complex interactive system. DNA as Blueprint says that the genome is a molecular 'map' of an organism. DNA as Message says that an organism can pass that map to the next generation by 'sending' the appropriate information.

Both of these are wrong, although they're quite good science fiction -or, at least, interestingly bad science fiction with good special effects.

If there is a 'receiver' for the DNA 'message' it is not the next generation of the organism, which does not even exist at the time the 'message' is being 'sent', but the ribosome, which is the molecular machine that turns DNA sequences (in a protein-coding gene) into protein. The ribosome is an essential part of the coding system; it functions as an 'adapter', changing the sequence information along the DNA into an amino acid sequence in proteins. Every cell contains many ribosomes: we say 'the' because they are all identical. The metaphor of DNA as information has become almost universal, yet virtually nobody has suggested that the ribosome must be a vast repository of information. The structure of the ribosome is now known in high detail, and there is no sign of obvious 'information-bearing' structure like that in DNA. The ribosome seems to be a fixed machine. So where has the information gone? Nowhere. That's the wrong question.

The root of these misunderstandings lies in a lack of attention to context. Science is very strong on content, but it has a habit of ignoring 'external' constraints on the systems being studied.

Context is an important but neglected feature of information. It is so easy to focus on the combinatorial clarity of the message and to ignore the messy, complicated processes carried out by the receiver when it decodes the message. Context is crucial to the interpretation of messages: to their meaning. In his book The User Illusion Tor Nørretranders introduced the term exformation to capture the role of the context, and Douglas Hofstadter made the same general point in Gödel, Escher, Bach. Observe how, in the next chapter, the otherwise incomprehensible message 'THEOSTRY' becomes obvious when context is taken into account.

Instead of thinking about a DNA 'blueprint' encoding an organism, it's easier to think of a CD encoding music. Biological development is like a CD that contains instructions for building a new CD-player. You can't 'read' those instructions without already having one. If meaning does not depend upon context, then the code on the CD should have an invariant meaning, one that is independent of the player. Does it, though?

Compare two extremes: a 'standard' player that maps the digital code on the CD to music in the manner intended by the design engineers, and a jukebox. With a normal jukebox, the only message that you send is some money and a button-push; yet in the context of the jukebox these are interpreted as a specific several minutes' worth of music. In principle, any numerical code can 'mean' any piece of music you wish; it just depends on how the jukebox is set up, that is, on the exformation associated with the jukebox's design. Now consider a jukebox that reacts to a CD not by playing the tune that's encoded on it, as a series of bits, but by interpreting that code as a number, and then playing some other CD to which that number has been assigned. For instance, suppose that a recording of Beethoven's Fifth Symphony starts, in digital form, with 11001. That's the number 25 in binary. So the jukebox reads the CD as '25', and looks for CD number 25, which we'll assume is a recording of Charlie Parker playing jazz. On the other hand, elsewhere in the jukebox is CD number 973, which actually is Beethoven's Fifth Symphony.

Then a CD of Beethoven's Fifth can be 'read' in two totally different ways: as a 'pointer' to Charlie Parker, or as Beethoven's Fifth Symphony itself (triggered by whichever CDs start with 973 in binary). Two contexts, two interpretations, two meanings, two results.
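
Here is the same point as a toy Python sketch (the bit string and catalogue numbers are the ones used above; everything else is our own invention): the identical bits either drive the player directly or get reinterpreted as an index into a quite different catalogue.

    # The same leading bits, two contexts, two readings.
    cd_bits = "11001"                       # opening bits of the Beethoven disc

    catalogue = {25: "Charlie Parker", 973: "Beethoven's Fifth Symphony"}

    def standard_player(bits):
        # The intended context: treat the bits as encoded audio.
        return f"decode {bits}... and play the music written on this disc"

    def quirky_jukebox(bits):
        # A different context: treat the same bits as a catalogue number.
        slot = int(bits, 2)                 # '11001' read as a number is 25
        return f"fetch catalogue item {slot} and play {catalogue[slot]}"

    print(standard_player(cd_bits))
    print(quirky_jukebox(cd_bits))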

Whether something is a message depends upon context, too: sender and receiver must agree upon a protocol for turning meanings into symbols and back again. Without this protocol a semaphore is just a few bits of wood that flap about. Tree branches are bits of wood that flap about, too, but no one ever tries to decode the message being transmitted by a tree. Tree rings -the growth rings that appear when you saw through the trunk, one ring per year -are a different matter. We have learned to 'decode' their 'message', about climate in the year 1066 and the like.

A thick ring indicates a good year with lots of growth on the tree, probably warm and wet; a thin ring indicates a poor year, probably cold and dry. But the sequence of tree rings only became a message, only conveyed information, when we figured out the rules that link climate to tree growth. The tree didn't send its message to us.

In biological development the protocol that gives meaning to the DNA message is the laws of physics and chemistry. That is where the exformation resides. However, it is unlikely that exformation can be quantified. An organism's complexity is not determined by the number of bases in its DNA sequence, but by the complexity of the actions initiated by those bases within the context of biological development. That is, by the meaning of the DNA 'message' when it is received by a finely tuned, up-and-running biochemical machine. This is where we gain an edge over those amoebas. Starting with an embryo that develops little flaps, and making a baby with those exquisite little hands, involves a series of processes that produce skeleton, muscles, skin, and so on. Each stage depends on the current state of the others, and all of them depend on contextual physical, biological, chemical and cultural processes.

A central concept in Shannon's information theory is something that he called entropy, which in this context is a measure of how statistical patterns in a source of messages affect the amount of information that the messages can convey. If certain patterns of bits are more likely than others, then their presence conveys less information, because the uncertainty is reduced by a smaller amount. In English, for example, the letter 'E' is much more common than the letter 'Q'. So receiving an 'E' tells you less than receiving a 'Q'. Given a choice between 'E' and 'Q', your best bet is that you're going to receive an 'E'*. And you learn the most when your expectations are proved wrong. Shannon's entropy smooths out these statistical biases and provides a 'fair' measure of information content.
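
For readers who like to see the machinery, here is a minimal Python sketch of that calculation (the letter frequencies are rough illustrative figures, not a careful corpus count): a rare 'Q' carries far more surprise than a common 'E', and the entropy averages the surprise over the whole source.

    import math

    # Rough, illustrative frequencies for just these two letters, renormalised
    # so that together they form a tiny two-letter 'source' of their own.
    freq = {"E": 0.127, "Q": 0.001}
    total = sum(freq.values())
    p = {letter: f / total for letter, f in freq.items()}

    for letter, prob in p.items():
        print(f"surprise on receiving '{letter}': {-math.log2(prob):.2f} bits")

    entropy = -sum(prob * math.log2(prob) for prob in p.values())
    print(f"entropy of this two-letter source: {entropy:.3f} bits per letter")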

In retrospect, it was a pity that he used the name 'entropy', because there is a longstanding concept in physics with the same name, normally interpreted as 'disorder'. Its opposite, 'order', is usually identified with complexity. The context here is the branch of physics known as thermodynamics, which is a specific simplified model of a gas. In thermodynamics, the molecules of a gas are modelled as 'hard spheres', tiny billiard balls. Occasionally balls collide, and when they do, they bounce off each other as if they are perfectly elastic. The Laws of Thermodynamics state that a large collection of such spheres will obey certain statistical regularities. In such a system, there are two forms of energy: mechanical energy and heat energy.

The First Law states that the total energy of the system never changes. Heat energy can be transformed into mechanical energy, as it is in, say, a steam engine; conversely, mechanical energy can be transformed into heat. But the sum of the two is always the same. The Second Law states, in more precise terms (which we explain in a moment), that heat cannot be transferred from a cool body to a hotter one. And the Third Law states that there is a specific temperature below which the gas cannot go - 'absolute zero', which is around -273 degrees Celsius.

The most difficult -and the most interesting -of these laws is the Second. In more detail, it involves a quantity that is again called 'entropy', which is usually interpreted as 'disorder'. If the gas in a room is concentrated in one corner, for instance, this is a more ordered (that is, less disordered!) state than one in which it is distributed uniformly throughout the room. So when the gas is uniformly distributed, its entropy is higher than when it is all in one corner. One formulation of the Second Law is that the amount of entropy in the universe always increases as time passes. Another way to say this is that the universe always becomes less ordered, or equivalently less complex, as time passes. According to this interpretation, the highly complex world of living creatures will inevitably become less complex, until the universe eventually runs out of steam and turns into a thin, lukewarm soup.

This property gives rise to one explanation for the 'arrow of time', the curious fact that it is easy to scramble an egg but impossible to unscramble one. Time flows in the direction of increasing entropy. So scrambling an egg makes the egg more disordered -that is, increases its entropy -which is in accordance with the Second Law. Unscrambling the egg makes it less disordered, and decreases entropy, which conflicts with the Second Law. An egg is not a gas, mind you, but thermodynamics can be extended to solids and liquids, too.

At this point we encounter one of the big paradoxes of physics, a source of considerable confusion for a century or so. A different set of physical laws, Newton's laws of motion, predicts that scrambling an egg and unscrambling it are equally plausible physical events. More precisely, if any dynamic behaviour that is consistent with Newton's laws is run backwards in time, then the result is also consistent with Newton's laws. In short, Newton's laws are 'time-reversible'.
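
Time-reversibility is easy to demonstrate in a toy model. The Python sketch below (with numbers of our own choosing) runs a one-dimensional elastic collision between two equal spheres forwards, flips the velocities, runs it forwards again, and lands back exactly where it started:

    def step(positions, velocities, dt):
        """One crude time step: move both spheres, then swap velocities if they
        have met (an elastic collision between equal masses in one dimension)."""
        x1, x2 = positions
        v1, v2 = velocities
        x1, x2 = x1 + v1 * dt, x2 + v2 * dt
        if x1 >= x2:                        # the spheres meet
            v1, v2 = v2, v1                 # equal masses exchange velocities
        return (x1, x2), (v1, v2)

    start = ((0.0, 10.0), (1.0, -1.0))      # toy positions and velocities
    pos, vel = start
    for _ in range(20):                     # run forwards in time
        pos, vel = step(pos, vel, 1.0)

    vel = tuple(-v for v in vel)            # flip the arrow of time
    for _ in range(20):                     # 'backwards' is just forwards again
        pos, vel = step(pos, vel, 1.0)

    print("recovered:", (pos, tuple(-v for v in vel)), "started from:", start)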

However, a thermodynamic gas is really just a mechanical system built from lots of tiny spheres. In this model, heat energy is just a special type of mechanical energy, in which the spheres vibrate but do not move en masse. So we can compare Newton's laws with the laws of thermodynamics. The First Law of Thermodynamics is simply a restatement of energy conservation in Newtonian mechanics, so the First Law does not contradict Newton's laws.

Neither does the Third Law: absolute zero is just the temperature at which the spheres cease vibrating. The amount of vibration can never be less than zero.

Unfortunately, the Second Law of Thermodynamics behaves very differently. It contradicts Newton's laws. Specifically, it contradicts the property of time-reversibility. Our universe has a definite direction for its 'arrow of time', but a universe obeying Newton's laws has two distinct arrows of time, one the opposite of the other. In our universe, scrambling eggs is easy and unscrambling them seems impossible. Therefore, according to Newton's laws, in a time-reversal of our universe, unscrambling eggs is easy but scrambling them is impossible. But Newton's laws are the same in both universes, so they cannot prescribe a definite arrow of time.

Many suggestions have been made to resolve this discrepancy. The best mathematical one is that thermodynamics is an approximation, involving a 'coarse-graining' of the universe in which details on very fine scales are smeared out and ignored. In effect, the universe is divided into tiny boxes, each containing (say) several thousand gas molecules. The detailed motion inside such a box is ignored, and only the average state of its molecules is considered.

It's a bit like a picture on a computer screen. If you look at it from a distance, you can see cows and trees and all kinds of structure. But if you look sufficiently closely at a tree, all you see is one uniformly green square, or pixel. A real tree would still have detailed structure at this scale -leaves and twigs, say -but in the picture all this detail is smeared out into the same shade of green.

In this approximation, once 'order' has disappeared below the level of the coarse-graining, it can never come back. Once a pixel has been smeared, you can't unsmear it. In the real universe, though, it sometimes can, because in the real universe the detailed motion inside the boxes is still going on, and a smeared-out average ignores that detail. So the model and the reality are different. Moreover, this modelling assumption treats forward and backward time asymmetrically. In forward time, once a molecule goes into a box, it can't escape. In contrast, in a time-reversal of this model it can escape from a box but it can never get in if it wasn't already inside that box to begin with.
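
A toy Python sketch of coarse-graining (the 'universe' here is just a list of random numbers, and the box size is our own choice) shows what gets thrown away: the box averages survive, but the fine detail needed to run the film backwards does not.

    import random

    random.seed(42)
    fine = [random.random() for _ in range(10_000)]    # the fine-scale detail
    BOX = 100                                          # values per box, say

    # Coarse-graining: keep only the average state of each box.
    coarse = [sum(fine[i:i + BOX]) / BOX for i in range(0, len(fine), BOX)]

    # Trying to 'reverse' the smearing: the best we can do is spread each
    # average back over its box, which is not the original detail at all.
    reconstructed = [value for value in coarse for _ in range(BOX)]

    error = sum(abs(a - b) for a, b in zip(fine, reconstructed)) / len(fine)
    print(f"{len(fine)} fine values -> {len(coarse)} box averages")
    print(f"mean reconstruction error: {error:.3f}")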

This explanation makes it clear that the Second Law of Thermodynamics is not a genuine property of the universe, but merely a property of an approximate mathematical description.

Whether the approximation is helpful or not thus depends on the context in which it is invoked, not on the content of the Second Law of Thermodynamics. And the approximation involved destroys any relation with Newton's laws, which are inextricably linked to that fine detail. Now, as we said, Shannon used the same word 'entropy' for his measure of the structure introduced by statistical patterns in an information source. He did so because the mathematical formula for Shannon's entropy looks exactly the same as the formula for the thermodynamic concept. Except for a minus sign. So thermodynamic entropy looks like negative Shannon entropy: that is, thermodynamic entropy can be interpreted as 'missing information'. Many papers and books have been written exploiting this relationship -attributing the arrow of time to a gradual loss of information from the universe, for instance. After all, when you replace all that fine detail inside a box by a smeared-out average, you lose information about the fine detail. And once it's lost, you can't get it back. Bingo: time flows in the direction of information-loss.
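
For reference, the standard textbook forms being compared are, in the simplest equiprobable case (this is the usual Boltzmann and Shannon notation, nothing special to the argument):

    S = k_B \ln W       % Boltzmann: thermodynamic entropy of W equally likely microstates
    H = \log_2 W        % Shannon: bits needed to single out one of W equally likely messages

Up to the constant k_B and the base of the logarithm the two expressions coincide, and the 'missing information' reading identifies the thermodynamic entropy with the information you would need in order to pin down which microstate the gas actually occupies.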

However, the proposed relationship here is bogus. Yes, the formulas look the same ... but they apply in very different, unrelated, contexts. In Einstein's famous formula relating mass and energy, the symbol c represents the speed of light. In Pythagoras's Theorem, the same letter represents one side of a right triangle. The letters are the same, but nobody expects to get sensible conclusions by identifying one side of a right triangle with the speed of light. The alleged relationship between thermodynamic entropy and negative information isn't quite that silly, of course. Not quite.

As we've said, science is not a fixed body of 'facts', and there are disagreements. The relation between Shannon's entropy and thermodynamic entropy is one of them. Whether it is meaningful to view thermodynamic entropy as negative information has been a controversial issue for many years. The scientific disagreements rumble on, even today, and published, peer-reviewed papers by competent scientists flatly contradict each other.

What seems to have happened here is a confusion between a formal mathematical setting in which 'laws' of information and entropy can be stated, a series of physical intuitions about heuristic interpretations of those concepts, and a failure to understand the role of context. Much is made of the resemblance between the formulas for entropy in information theory and thermodynamics, but little attention is paid to the context in which those formulas apply. This habit has led to some very sloppy thinking about some important issues in physics.

One important difference is that in thermodynamics, entropy is a quantity associated with a state of the gas, whereas in information theory it is defined for an information source: a system that generates entire collections of states ('messages'). Roughly speaking, a source is a phase space for successive bits of a message, and a message is a trajectory, a path, in that phase space. In contrast, a thermodynamic configuration of molecules is a point in phase space. A specific configuration of gas molecules has a thermodynamic entropy, but a specific message does not have a Shannon entropy. This fact alone should serve as a warning. And even in information theory, the information 'in' a message is not negative information-theoretic entropy. Indeed the entropy of the source remains unchanged, no matter how many messages it generates.

There is another puzzle associated with entropy in our universe. Astronomical observations do not fit well with the Second Law. On cosmological scales, our universe seems to have become more complex with the passage of time, not less complex. The matter in the universe started out in the Big Bang with a very smooth distribution, and has become more and more clumpy -more and more complex -with the passage of time. The entropy of the universe seems to have decreased considerably, not increased. Matter is now segregated on a huge range of scales: into rocks, asteroids, planets, stars, galaxies, galactic clusters, galactic superclusters and so on. Using the same metaphor as in thermodynamics, the distribution of matter in the universe seems to be becoming increasingly ordered. This is puzzling since the Second Law tells us that a thermodynamic system should become increasingly disordered.

The cause of this clumping seems to be well established: it is gravity. A second time-reversibility paradox now rears its head. Einstein's field equations for gravitational systems are time-reversible. This means that if any solution of Einstein's field equations is time-reversed, it becomes an equally valid solution. Our own universe, run backwards in this manner, becomes a gravitational system that gets less and less clumpy as time passes -so getting less clumpy is just as valid, physically, as getting more clumpy. Our universe, though, does only one of these things: it gets more clumpy.

Paul Davies's view here is that 'as with all arrows of time, there is a puzzle about where the asymmetry comes in ... The asymmetry must therefore be traced to initial conditions'. What he means here is that even with time-reversible laws, you can get different behaviour by starting the system in a different way. If you start with an egg and stir it with a fork, then it scrambles. If you start with the scrambled egg, and very very carefully give each tiny particle of egg exactly the right push along precisely the opposite trajectory, then it will unscramble. The difference lies entirely in the initial state, not in the laws. Notice that 'stir with a fork' is a very general kind of initial condition: lots of different ways to stir will scramble the egg. In contrast, the initial condition for unscrambling an egg is extremely delicate and special.

In a way this is an attractive option. Our clumping universe is like an unscrambling egg: its increasing complexity is a consequence of very special initial conditions. Most 'ordinary' initial conditions would lead to a universe that isn't clumped -just as any reasonable kind of stirring leads to a scrambled egg. And observations strongly suggest that the universe's initial conditions at the time of the Big Bang were extremely smooth, whereas any 'ordinary' state of a gravitational system presumably should be clumped. So, in agreement with the suggestion just outlined, it seems that the initial conditions of the universe must have been very special -an attractive proposition for those who believe that our universe is highly unusual, and ditto for our place within it.

From the Second Law to God in one easy step.

Roger Penrose has even quantified how special this initial state is, by comparing the thermodynamic entropy of the initial state with that of a hypothetical but plausible final state in which the universe has become a system of Black Holes. This final state shows an extreme degree of clumpiness - though not the ultimate degree, which would be a single giant Black Hole.

The result is that the entropy of the initial state is about 10^30 times smaller than that of the hypothetical final state, making it extremely special. So special, in fact, that Penrose was led to introduce a new time-asymmetric law that forces the early universe to be exceptionally smooth.

Oh, how our stories mislead us ... There is another, much more reasonable, explanation. The key point is simple: gravitation is very different from thermodynamics. In a gas of buzzing molecules, the uniform state -equal density everywhere -is stable. Confine all the gas into one small part of a room, let it go, and within a split second it's back to a uniform state. Gravity is exactly the opposite: uniform systems of gravitating bodies are unstable. Differences smaller than any specific level of coarse-graining not only can 'bubble up' into macroscopic differences as time passes, but do.

Here lies the big difference between gravity and thermodynamics. The thermodynamic model that best fits our universe is one in which differences dissipate by disappearing below the level of coarse-graining as time marches forwards. The gravitic model that best fits our universe is one in which differences amplify by bubbling up from below the level of coarse-graining as time marches forwards. The relation of these two scientific domains to coarse-graining is exactly opposite when the same arrow of time is used for both.

We can now give a completely different, and far more reasonable, explanation for the 'entropy gap' between the early and late universes, as observed by Penrose and credited by him to astonishingly unlikely initial conditions. It is actually an artefact of coarse-graining.

Gravitational clumping bubbles up from a level of coarse-graining to which thermodynamic entropy is, by definition, insensitive. Therefore virtually any initial distribution of matter in the universe would lead to clumping. There's no need for something extraordinarily special.

The physical differences between gravitating systems and thermodynamic ones are straightforward: gravity is a long-range attractive force, whereas elastic collisions are short-range and repulsive. With such different force laws, it is hardly surprising that the behaviour should be so different. As an extreme case, imagine systems where gravity is so short range that it has no effect unless particles collide, but then they stick together forever. Increasing clumpiness is obvious for such a force law.
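
That extreme case is simple enough to simulate. In the toy Python sketch below (all numbers are our own, purely for illustration), particles random-walk on a ring of sites and stick together whenever they land on the same site; the number of separate clumps can only fall:

    import random

    random.seed(0)
    SITES = 100
    clumps = {site: 1 for site in random.sample(range(SITES), 40)}  # 40 lone particles

    for step in range(500):
        moved = {}
        for site, mass in clumps.items():
            new_site = (site + random.choice((-1, 1))) % SITES      # random walk on a ring
            moved[new_site] = moved.get(new_site, 0) + mass          # land together -> stick
        clumps = moved
        if step % 100 == 0:
            print(f"step {step:3d}: {len(clumps)} clumps")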

The real universe is both gravitational and thermodynamic. In some contexts, the thermodynamic model is more appropriate and thermodynamics provides a good model. In other contexts, a gravitational model is more appropriate. There are yet other contexts: molecular chemistry involves different types of forces again. It is a mistake to shoehorn all natural phenomena into the thermodynamic approximation or the gravitic approximation. It is especially dubious to expect both thermodynamic and gravitic approximations to work in the same context, when the way they respond to coarse-graining is diametrically opposite.

See? It's simple. Not magical at all ...

Perhaps it's a good idea to sum up our thinking here.

The 'laws' of thermodynamics, especially the celebrated Second Law, are statistically valid models of nature in a particular set of contexts. They are not universally valid truths about the universe, as the clumping of gravity demonstrates. It even seems plausible that a suitable measure of gravitational complexity, like thermodynamic entropy but different, might one day be defined -call it 'gravtropy', say. Then we might be able to deduce, mathematically, a 'second law of gravitics', stating that the gravtropy of a gravitic system increases with time. For example, gravtropy might perhaps be the fractal dimension ('degree of intricacy') of the system.
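
Since 'gravtropy' is our own speculative coinage, any code can only gesture at the idea; as one hedged illustration, here is the usual box-counting estimate of fractal dimension applied to a toy clumped set of points, the sort of 'degree of intricacy' measure the paragraph has in mind:

    import math
    import random

    random.seed(3)
    # A toy 'clumped' set of points: tight clusters around a few random centres.
    centres = [(random.random(), random.random()) for _ in range(5)]
    points = [(cx + random.gauss(0, 0.01), cy + random.gauss(0, 0.01))
              for cx, cy in centres for _ in range(200)]

    def box_count(pts, size):
        """How many boxes of the given size are needed to cover the points."""
        return len({(math.floor(x / size), math.floor(y / size)) for x, y in pts})

    # The box-counting dimension is the slope of log(count) against log(1/size).
    sizes = [0.2, 0.1, 0.05, 0.025]
    counts = [box_count(points, s) for s in sizes]
    slope = (math.log(counts[-1]) - math.log(counts[0])) / \
            (math.log(1 / sizes[-1]) - math.log(1 / sizes[0]))
    print("box counts:", counts)
    print(f"rough box-counting dimension: {slope:.2f}")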

Even though coarse-graining works in opposite ways for these two types of system, both 'second laws' -thermodynamic and gravitic -would correspond rather well to our own universe. The reason is that both laws are formulated to correspond to what we actually observe in our own universe. Nevertheless, despite this apparent concurrence, the two laws would apply to drastically different physical systems: one to gases, the other to systems of particles moving under gravity.

With these two examples of the misuse of information-theoretic and associated thermodynamic principles behind us, we can turn to the intriguing suggestion that the universe is made from information.

Ridcully suspected that Ponder Stibbons would invoke 'quantum' to explain anything really bizarre, like the disappearance of the Shell Midden People. The quantum world is bizarre, and this kind of invocation is always tempting. In an attempt to make sense of the quantum universe, several physicists have suggested founding all quantum phenomena (that is, everything) on the concept of information. John Archibald Wheeler coined the phrase 'It from Bit' to capture this idea. Briefly, every quantum object is characterised by a finite number of states. The spin of an electron, for instance, can either be up or down, a binary choice. The state of the universe is therefore a huge list of ups and downs and more sophisticated quantities of the same general kind: a very long binary message.

So far, this is a clever and (it turns out) useful way to formalise the mathematics of the quantum world. The next step is more controversial. All that really matters is that message, that list of bits.

And what is a message? Information. Conclusion: the real stuff of the universe is raw information. Everything else is made from it according to quantum principles. Ponder would approve.

Information thereby takes its place in a small pantheon of similar concepts -velocity, energy, momentum -that have made the transition from convenient mathematical fiction to reality.

Physicists like to convert their technically most useful mathematical concepts into real things: like Discworld, they reify the abstract. It does no physical harm to 'project' the mathematics back into the universe like this, but it may do philosophical harm if you take the result literally.

Thanks to a similar process, for example, entirely sane physicists today insist that our universe is merely one of trillions that coexist in a quantum superposition. In one of them you left your house this morning and were hit by a meteorite; in the one in which you're reading this book, that didn't happen. 'Oh, yes,' they urge: 'those other universes really do exist. We can do experiments to prove it.'

Not so.

Consistency with an experimental result is not a proof, not even a demonstration, that an explanation is valid. The 'many-worlds' concept, as it is called, is an interpretation of the experiments, within its own framework. But any experiment has many interpretations, not all of which can be 'how the universe really does it'. For example, all experiments can be interpreted as 'God made that happen', but those selfsame physicists would reject their experiment as a proof of the existence of God. In that they are correct: it's just one interpretation. But then, so are a trillion coexisting universes.

Quantum states do superpose. Quantum universes can also superpose. But separating them out into classical worlds in which real-life people do real-life things, and saying that those superpose, is nonsense. There isn't a quantum physicist anywhere in the world that can write down the quantum-mechanical description of a person. How, then, can they claim that their experiment (usually done with a couple of electrons or photons) 'proves' that an alternate you was hit by a meteorite in another universe?

'Information' began its existence as a human construct, a concept that described certain processes in communication. This was 'bit from it', the abstraction of a metaphor from reality, rather than 'it from bit', the reconstruction of reality from the metaphor. The metaphor of information has since been extended far beyond its original bounds, often unwisely. Reifying information into the basic substance of the universe is probably even more unwise. Mathematically, it probably does no harm, but Reification Can Damage Your Philosophy.
