The Philosophy of Physics
Simon Saunders
‘Physics, and physics alone, has complete coverage’, according to Quine. Philosophers of physics will mostly agree. But there is less consensus among physicists, many of whom have a sneaking regard for philosophical questions - about the use of the word ‘reality’, for example.
Why be mealy-mouthed when it comes to what is real? The answer lies in quantum mechanics. Very little happens in physics these days without quantum mechanics having its say: never has a theory been so prolific in predicting new and astounding effects, with so vast a scope. But for all its uncanny fecundity, there is a certain difficulty. After a century of debate, there is very little agreement on how this difficulty should be resolved - indeed, what consensus there was on it has slowly evaporated. The crucial point of contention concerns the interface between macro and micro. Since experiments on the micro-world involve measurements, and measurements involve observable changes in the instrumentation, it is unsurprising that the difficulty found the name it did: the problem of measurement. But really it is a problem of how, and whether, the theory describes any actual events. As Werner Heisenberg put it, “it is the `factual' character of an event describable in terms of the concepts of daily life which is not without further comment contained in the mathematical formalism of quantum theory” (Heisenberg 1959, p.121).
The problem is so strange, so intractable, and so far-reaching that, with the exception of space-time philosophy, it has come to dominate the philosophy of physics. The philosophy of space-time is the subject of a separate chapter: no apology, then, is needed for devoting this chapter to the problem of measurement alone.
1. Orthodoxy
Quantum mechanics was virtually completed in 1926. But it only compounded - entrenched - a problem that had been obvious for years: wave-particle duality. For a simple example, consider Young's two-slit experiment, in which monochromatic light, incident on two narrow, parallel slits, subsequently produces an interference pattern on a distant screen (in this case closely spaced bands of light and dark parallel to the slits). If either of the slits is closed, the pattern is lost. If one or the other slit is closed sporadically and randomly, so that only one is open at any one time, the pattern is likewise lost.
There is no difficulty in understanding this effect on the supposition that light consists of waves; but on careful examination of low-intensity light, the interference pattern is built up, one spot after another - as if light consists of particles (photons). The pattern slowly emerges even if only one photon is in the apparatus at any one time; and yet it is lost when only one slit is open at any one time. It appears, absurdly, as if the photon must pass through both slits, and interfere with itself. As Richard Feynman observed in his Lectures on Physics, this is ‘a phenomenon which is impossible, absolutely impossible, to explain in any classical way, and which has in it the heart of quantum mechanics. In reality, it contains the only mystery.’
Albert Einstein, in 1905, was the first to argue for this dual nature of light; Niels Bohr, in 1924, was the last to accept it. For Einstein the equations discovered by Heisenberg and Erwin Schrödinger did nothing to make it more understandable. On this point, indeed, he and Bohr were in agreement (Bohr was interested in understanding experiments, rather than equations); but Bohr, unlike Einstein, was prepared to see in the wave-particle duality not a puzzle to be solved but a limitation to be lived with, forced upon us by the very existence of the `quantum of action' (resulting from Planck's constant h, defining in certain circumstances a minimal unit of action); what Bohr also called the quantum postulate. The implication, he thought, was that a certain `ideal of explanation' had to be given up, not that classical concepts were inadequate or incomplete or that new concepts were needed. This ideal was the independence of a phenomenon from the means by which it is observed.
With this ideal abandoned, the experimental context must enter into the very definition of a phenomenon. But that meant classical concepts enter essentially too, if only because the apparatus must be classically describable. In fact, Bohr held the more radical view that these were the only real concepts available (they were unrevisable; in his later writings, they were a condition on communicability, on the very use of ordinary language).
Less obviously, the quantum postulate also implied limitations on the `mutual definability' of classical concepts. But therein lay the key to what Bohr called the `generalization' of classical mechanics: certain classical concepts, like `space-time description', `causation', `particle', `wave', if given operational meaning in a given experimental context, excluded the use of others. Thus the momentum and position of a system could not both, in a single experimental context, be given a precise meaning: momentum in the range Δp, and position in range Δx, must satisfy the inequality ΔpΔx≥h (an example of the Heisenberg uncertainty relations).
As a result, phenomena were to be described and explained, in a given context, using only a subset of the total set of classical concepts normally available - and neither to require nor to permit any dovetailing with those in another, mutually exclusive experimental context, using a different subset of concepts. That in fact is how genuine novelty was to arise, according to Bohr, despite the unrevisability of classical concepts: thus light behaved like a wave in one context, like a particle in another, without contradiction.
Concepts standing in this exclusionary relation he called complementary. Bohr's great success was that he could show that indeed complementary concepts, at least those that could be codified in uncertainty relationships, could not be operationally defined in a single experimental context. Thus, in the case of the two-slit experiment, any attempt to determine which slit the photon passes through (say by measuring the recoil, hence momentum, of the slit) leads to an uncertainty in its position sufficient to destroy the interference pattern. Arguments of this kind were the substance of the highly publicized debates over foundations that Bohr held with Einstein, in the critical years just after the discovery of the new equations; Bohr won them all.
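In outline (introducing, purely for the sketch, a slit separation d, a screen at distance L, and light of wavelength λ - quantities not specified in the text): the fringe spacing on the screen is roughly λL/d. A photon arriving at a given point on the screen imparts to the diaphragm a transverse recoil momentum that differs, according to which slit it passed through, by about δp ≈ (h/λ)(d/L). To read the slit off the recoil, the diaphragm's momentum must be known to better than δp; by the uncertainty relation its position is then uncertain by Δx ≳ h/δp ≈ λL/d - a full fringe spacing, enough to smear out the interference pattern entirely.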
Bohr looked to the phenomena, not to the equations, surely a selling point of his interpretation in the 1920s: the new formalism was after all mathematically challenging. When he first presented his philosophy of complementarity, at the Como lecture of 1927, he made clear it was based on ‘the general trend of the development of the theory from its very beginning’ (Bohr 1934 p.52) - a reference to the so-called old quantum theory, rather than to the new formalism. The latter, he acknowledged, others in the audience understood much better than he did.
It is in the equations that the problem of measurement is most starkly seen. The state ψ in non-relativistic quantum mechanics is a function on the configuration space of a system (or one isomorphic to it, like momentum space). A point in this space specifies the positions of all the particles comprising a system at a given instant of time (respectively, their momenta). This function must be square-integrable, and is normalized so that the integral of its modulus squared over configuration space (momentum space) is one. Its time development is determined by the Schrödinger equation, which is linear - meaning, if ψ₁(t), ψ₂(t) are solutions, then so is c₁ψ₁(t)+c₂ψ₂(t), for arbitrary complex numbers c₁, c₂.
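To spell the linearity out (writing H for the Hamiltonian operator - the explicit form of the equation is not needed above and is given here only for concreteness): the Schrödinger equation reads iℏ∂ψ/∂t = Hψ, with H a linear operator. So if iℏ∂ψ₁/∂t = Hψ₁ and iℏ∂ψ₂/∂t = Hψ₂, then

iℏ∂(c₁ψ₁+c₂ψ₂)/∂t = c₁Hψ₁ + c₂Hψ₂ = H(c₁ψ₁+c₂ψ₂),

and the superposition is again a solution.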
Now for the killer question. In many cases the linear (1:1 and norm-preserving, hence unitary) evolution of each state ψk admits of a perfectly respectable, deterministic and indeed classical (or at least approximately classical) description, of a kind that can be verified and is largely uncontentious. Thus the system in state ψ₁, having passed through a semi-reflecting mirror, reliably triggers a detector. The system in state ψ₂, having been reflected by the mirror, reliably passes it by. But by linearity if ψ₁ and ψ₂ are solutions to the Schrödinger equation, so is c₁ψ₁+c₂ψ₂. What happens then?
The orthodox answer to this question is given by the measurement postulate: that in a situation like this, the state c₁ψ₁+c₂ψ₂ only exists prior to measurement. When the apparatus couples to the system, on measurement, the detector either fires or it doesn't, with probability ‖c₁‖² and ‖c₂‖² respectively. Indeed, as is often the case, when the measurement is repeatable - over sufficiently short times, the same measurement can be performed on the same system yielding the same outcome - the state must have changed on the first experiment, from the initial superposition, c₁ψ₁+c₂ψ₂, to either the state ψ₁, or to the state ψ₂ (in which it thereafter persists on repeated measurements). This transition is in contradiction with the unitary evolution of the state, prior to measurement. It is wave-packet reduction (WPR).
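To take the simplest case: suppose the semi-reflecting mirror just described is an exactly even split, so that (up to phases, and with the state normalized) c₁ = c₂ = 1/√2. The measurement postulate then says the detector fires with probability ‖c₁‖² = 1/2 and fails to fire with probability ‖c₂‖² = 1/2; and if it fires, the state thereafter is ψ₁, so an immediate repetition of the measurement gives the same outcome not with probability 1/2 but with probability 1. It is this change, from 1/2 to 1, that the unitary evolution by itself does not supply.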
What has this to do with the wave-particle duality? Just this: let the state of the photon as it is incident on the screen on the far side of the slits be written as the superposition c₁ψ₁+c₂ψ₂+c₃ψ₃+...+cnψn, where ψk is the state in which the photon is localized in the kth region of the screen. Then by the measurement postulate, and supposing it is ‘photon position’ which is measured by exposing and processing a photographic emulsion, the photon is measured to be in region k with probability ‖ck‖². In this way the ‘wave’ (the superposition, the wave extended over the whole screen) is converted to the ‘particle’ (a localized spot on the screen). The appearance of a localized spot (and the disappearance of the wave everywhere else across the screen) is WPR.
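The statistical content of this can be mimicked numerically. The following sketch (an illustration only: the cosine-under-Gaussian amplitudes are a stock idealization of a two-slit pattern, not anything derived in the text) assigns an amplitude ck to each of n regions of the screen, normalizes, and samples detection events one at a time with probability ‖ck‖². Each run yields a single spot; the pattern is present only in the accumulated statistics.

import numpy as np

# Illustrative two-slit amplitudes over n screen regions: a cosine interference
# factor under a Gaussian envelope (a stock idealization, assumed here).
n = 200
x = np.linspace(-1.0, 1.0, n)                  # screen coordinate, arbitrary units
amp = np.cos(20 * x) * np.exp(-x**2 / 0.2)     # c_k, up to normalization
c = amp / np.sqrt(np.sum(np.abs(amp)**2))      # normalize: the |c_k|^2 sum to 1
probs = np.abs(c)**2                           # Born-rule probabilities |c_k|^2

# Photons arrive one at a time; each detection is a single spot, sampled from probs.
rng = np.random.default_rng(0)
spots = rng.choice(n, size=5000, p=probs)      # 5000 single-photon detections
counts = np.bincount(spots, minlength=n)       # accumulated spots per screen region
# counts now approximates 5000 * probs: the interference pattern, built spot by spot.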
Might WPR (and in particular the apparent conflict between it and the unitary evolution prior to measurement) be a consequence of the fact that the measurement apparatus has not itself been included in the dynamical description? Then model the apparatus explicitly, if only in the most schematic and idealized terms. Suppose as before (as we require of a good measuring device) that the (unitary) dynamics is such that if the microscopic system is initially in the state ψk, then the state of the joint system (microscopic system together with the apparatus) after the measurement is reliably Ψk (with the apparatus showing ‘the kth-outcome recorded'). It now follows from linearity that if one has initially the superposition c₁ψ₁+c₂ψ₂+..., one obtains after measurement (by nothing but unitarity) the final state c₁Ψ₁+c₂Ψ₂+..., and nothing has been gained.
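Written out (with Φ₀ for the apparatus `ready' state, and ⊗ for the composite of system and apparatus - notation introduced here only for the display), the requirement on a good measuring device is that the unitary dynamics effects

ψk ⊗ Φ₀ → Ψk

for each k, with the apparatus in Ψk showing `the kth outcome recorded'. Linearity then settles the fate of the superposition:

(c₁ψ₁ + c₂ψ₂ + ...) ⊗ Φ₀ → c₁Ψ₁ + c₂Ψ₂ + ...,

a superposition of macroscopically distinct records rather than any one of them.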
Should one then model the human observer as well? It is a fool's errand. The `chain of observation' has to stop somewhere - by applying the measurement postulate, not by modeling further details of the measuring process explicitly, or the observers as physical systems themselves.
These observations were first made in detail, and with great rigor, by the mathematician John von Neumann in his Mathematical Foundations of Quantum Mechanics in 1932. They were also made informally by Erwin Schrödinger, by means of a famous thought experiment, in which a cat is treated as a physical system, and modeled explicitly, as developing into a superposition of two macroscopic outcomes. It was upsetting (and not only to cat-lovers) to consider the situation when detection of ψ₁ reliably causes not only a Geiger counter to fire but the release of a poison that causes the death of the cat, described by Ψ₁. We, performing the experiment (if quantum mechanics is to be believed), will produce a superposition of a live and dead cat of the form c₁Ψ₁+c₂Ψ₂. Is it only when we go on to observe which it is that we should apply the measurement postulate, and conclude it is dead (with probability ‖c₁‖²) or alive (with probability ‖c₂‖²)? Or has the cat got there before us, and already settled the question? As Einstein inquired, ‘Is the Moon there when nobody looks?’. If so, then the state c₁Ψ₁+c₂Ψ₂ is simply a wrong or (at best) an incomplete description of the cat and the decaying atom, prior to observation.
The implication is obvious: why not look for a more detailed level of description? But von Neumann and Schrödinger hinted at the idea that a limitation like this was inevitable; that WPR was an expression of a certain limit to physical science; that it somehow brokered the link between the objective and the subjective aspects of science, between the object of knowledge, and the knowing subject; that…. Writings on this score trod a fine line between science and mysticism - or idealism.
Thus John Wheeler's summary, which reads like Berkeleian idealism: “In today's words Bohr's point --- and the central point of quantum theory --- can be put into a single, simple sentence. ‘No elementary phenomenon is a phenomenon until it is a registered (observed) phenomenon.’” And Heisenberg’s: the ‘factual element’ missing from the formalism “appears in the Copenhagen [orthodox] interpretation by the introduction of the observer”. ‘The observer’ was already a ubiquitous term in writings on relativity, but there it could be replaced by ‘inertial frame’, meaning a concrete system of rods and clocks: no such easy translation was available in quantum mechanics.
Einstein had a simpler explanation. The quantum mechanical state is an incomplete description. WPR is purely epistemic - the consequence of learning something new. His argument (devised with Boris Podolsky and Nathan Rosen) was independent of micro-macro correlations, resting rather on correlations between distant systems: they too could be engineered so as to occur in a superposition. Thus Ψk might describe a particle A in state ψk correlated with particle B in state φk, where A and B are spatially remote from one another. In that case the observation that A is in state ψk would imply that B is in state φk - and one will learn this (with probability ‖ck‖²) by applying the measurement postulate to the total system, as given by the state c₁Ψ₁+c₂Ψ₂, on the basis only of measurements on A. How can B ‘acquire’ a definite state (either φ₁ or φ₂) on the basis of the observation of the distant particle A? - and correspondingly, how can the probabilities of certain outcomes on measurements of B be changed? The implication, if there is to be no spooky `action-at-a-distance', is that B was already in one or the other of the states φ₁ or φ₂ - in which case the initial description of the composite system c₁Ψ₁+c₂Ψ₂ was simply wrong, or at best incomplete. This is the famous EPR argument.
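In symbols (with ⊗ as before): the total state is

c₁Ψ₁ + c₂Ψ₂ = c₁(ψ₁⊗φ₁) + c₂(ψ₂⊗φ₂),

with A and B spatially remote. A measurement on A alone that finds it in state ψ₁ (probability ‖c₁‖²) leaves the pair, by the measurement postulate, in Ψ₁ = ψ₁⊗φ₁; B is thereby assigned the state φ₁, and the probabilities for whatever is subsequently measured on B are those computed from φ₁, not from the original superposition.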
It was through investigation of the statistical nature of such correlations in the 1960s and '70s that foundational questions re-entered the mainstream of physics. They were posed by the physicist John Bell, in terms of a theory - any theory - that gives additional information about the systems A, B, over and above that defined by the quantum mechanical state. He found that if such additional values for physical quantities (`hidden variables') are local - unchanged by remote experiments - then their averages (which one might hope would yield the quantum mechanically predicted statistics) must satisfy a certain inequality. Schematically:
Hidden Variables + Locality (+ background assumptions) ⇒ Bell inequality.
But experiment, and the quantum mechanical predictions, went against the Bell inequality. Experiment thus went against Einstein: if there is to be a hidden level of description, not provided by the quantum mechanical state, and satisfying very general background assumptions, it will have to be non-local.
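The `certain inequality' comes in several versions; the one most often tested is the CHSH inequality (not named above, but standard in the experimental literature), which bounds a particular combination of four correlation functions by 2 in any local hidden-variable theory. The sketch below does no more than evaluate the quantum mechanical prediction for spin measurements on a pair of particles in the singlet state - for which the correlation of the ±1 outcomes at analyser angles a and b is -cos(a-b) - at the standard choice of angles, and shows that the bound is exceeded.

import numpy as np

# Quantum mechanical correlation of the +/-1 spin outcomes for a singlet pair,
# with the two analysers set at angles a and b.
def E(a, b):
    return -np.cos(a - b)

# Standard CHSH settings (radians).
a, a2 = 0.0, np.pi / 2
b, b2 = np.pi / 4, 3 * np.pi / 4

# CHSH combination: any local hidden-variable model requires |S| <= 2.
S = E(a, b) - E(a, b2) + E(a2, b) + E(a2, b2)
print(abs(S))  # 2*sqrt(2), roughly 2.83: the quantum prediction exceeds the bound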
But is this argument from non-locality, following on from Bell's work, really an argument against hidden variables? Not if quantum mechanics is already judged non-local, as it appears, assuming the completeness of the state, and making use of the measurement postulate. Bohr's reply to EPR in effect accepted this point: once the type of experiment performed remotely is changed, and some outcome obtained, so too does the state for a local event change; so too do the probabilities for local outcomes change. So, were single-case probabilities measurable, one would be able to signal superluminally (but of course neither they nor the state is directly measurable). Whether or not there are hidden variables, it seems, there is non-locality.
2. Pilot-wave theory
The climate, by the mid 1980s, was altogether transformed. Not only had questions of realism and non-locality been subject to experimental tests, but it was realized - again, largely due to Bell’s writings, newly anthologized as Speakable and Unspeakable in Quantum Mechanics - that something was amiss with Bohr's arguments for complementarity. For a detailed solution to the problem of measurement - incorporating, admittedly, a form of non-locality - was now clearly on the table, demonstrably equivalent to standard quantum mechanics.
It is the pilot-wave theory (also called Bohmian mechanics). It is explicitly dualistic: the wave function must satisfy Schrödinger's equation, as in the conventional theory, but is taken as a physical field, albeit one that is defined on configuration space E³ᴺ (where N is the number of particles); and in addition there is a unique trajectory in E³ᴺ - specifying, instant by instant, the configuration of all the particles, as determined by the wave function.
Any complex-valued function ψ on a space can be written as ψ = A exp(iS), where A and S are real-valued functions on that space. In the simplest case of a single particle (N = 1) configuration space is ordinary Euclidean space E³. Let ψ(x,t) satisfy the Schrödinger equation; the new postulate is that a particle of mass m at the point x at time t must have the velocity:
v(x,t) = (ℏ/m)∇S(x,t)
(the guidance equation). If, furthermore, the probability density ρ(x,t₀) on the configuration space of the particle at time t₀ is given by
ρ(x,t₀)=A²(x,t₀)
(the Born rule) - that is, ρ(x,t₀)ΔV is the probability of finding the particle in volume ΔV about the point x at time t₀ - then the probability of finding it in the region ΔV′ to which ΔV is mapped by the guidance equation at time t, will be the same,
ρ′(x′,t)ΔV′=ρ(x,t₀)ΔV.
What does `probability' really mean here? Never mind: that is a can of worms in any deterministic theory. Let us say it means whatever probability means in classical statistical mechanics, which is likewise deterministic. Thus conclude: the probability of a region of configuration space, as given by the Born rule, is preserved under the flow of the velocity field.
It is a humble-enough claim, but it secures the empirical equivalence of the theory with the standard formalism, equipped with the measurement postulate, so long as particle positions are all that are directly measured. And it solves the measurement problem: nothing particularly special occurs on measurement. Rather, one simply discovers what is there, the particle positions at the instant they are observed.
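A minimal illustration of the guidance equation (using a free-particle plane wave, which is not normalizable and so serves only as an idealization): take ψ(x,t) = A exp i(k·x - ωt), with A constant and k a fixed wave vector. Then S(x,t) = k·x - ωt, so v = (ℏ/m)∇S = ℏk/m: uniform motion at the de Broglie velocity p/m, with p = ℏk. And since ρ = A² is constant in space and time, the Born-rule distribution is trivially preserved under this flow, as the equivariance claim requires.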