Confusion about the thermal interpretation's account of measurement

In summary, the conversation contrasts two accounts of a quantum measurement, which the original poster calls the first and second story. In the first story a macroscopic superposition arises from the linearity of the Schrödinger equation; in the second story the detector is described by a reduced density matrix whose relevant variable evolves, effectively deterministically, into one of the two stable readings. The second story can be told within the formal core of quantum mechanics, but the thermal interpretation claims to explain what happens in each individual case. The traditional fuss about interpretations rests on the assumption that macroscopic systems can be described by a pure state, an assumption the thermal interpretation rejects. Decoherence alone cannot solve the measurement problem, since it still relies on the eigenvalue link to measurement; the thermal interpretation instead has unique outcomes built in from the outset and only needs to explain the origin of the probabilities.
  • #1
nicf
TL;DR Summary
Does the account of measurement in Neumaier's thermal interpretation papers actually depend on the thermal interpretation?
I'm a mathematician with a longstanding interest in physics, and I've recently been enjoying reading and thinking about Arnold Neumaier's thermal interpretation, including some threads on this forum. There's something that's still confusing me, though, and I'm hoping someone here can clear it up. Most of the questions here come from the third paper in the series.

Consider some experiment, like measuring the spin of a suitably prepared electron, where we can get one of two outcomes. The story usually goes that, before the electron sets off the detector, the state is something like ##\left[\sqrt{\frac12}(|\uparrow_e\rangle+|\downarrow_e\rangle)\right]\otimes|\mbox{ready}\rangle##, where ##|\uparrow_e\rangle## denotes the state of an electron which has spin up around the relevant axis, and afterwards the state is something like ##\sqrt{\frac12}(|\uparrow_e\rangle\otimes |\uparrow_D\rangle+|\downarrow_e\rangle\otimes|\downarrow_D\rangle)##, where ##|\uparrow_D\rangle## denotes a state in which the (macroscopic) detector has reacted the way it would have if the electron had started in the state ##|\uparrow_e\rangle##. It's usually argued that this macroscopic superposition has to arise because the Schrödinger equation is linear. Let's call this the first story.
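(To spell out the linearity point for myself, here is a toy numerical version of the first story, with a two-level electron and a three-level stand-in for the detector. The names and the particular unitary are my own inventions, not anything from the papers.)

```python
import numpy as np

# Two-level electron (up, down) and a three-level toy "detector"
# (ready, up_D, down_D). A real detector has ~10^23 degrees of freedom;
# this is only meant to exhibit the linearity argument.
up, down = np.eye(2)
ready, up_D, down_D = np.eye(3)

def idx(e, d):           # index of |e> (x) |d> in the np.kron basis
    return 3 * e + d

# A unitary "premeasurement" mapping |up, ready> -> |up, up_D> and
# |down, ready> -> |down, down_D> (here just a permutation of basis states).
U = np.eye(6)
for a, b in [(idx(0, 0), idx(0, 1)), (idx(1, 0), idx(1, 2))]:
    U[[a, b]] = U[[b, a]]

psi_in = np.kron((up + down) / np.sqrt(2), ready)
psi_out = U @ psi_in

target = (np.kron(up, up_D) + np.kron(down, down_D)) / np.sqrt(2)
print(np.allclose(psi_out, target))   # True: linearity forces the entangled superposition
```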

This description has struck many people (including me) as confusing, since it seems to contradict what I actually see when I run the experiment: if I see the "up" result on my detector, then the "down" term above doesn't seem to have anything to do with the world I see in front of me. It's always seemed to me that this apparent contradiction is the core of the "measurement problem" and, to me at least, resolving it is the central reason to care about interpretations of quantum mechanics.

Neumaier seems to say that the first story is simply incorrect. Instead he tells what I'll call the second story: because the detector sits in a hot, noisy, not-at-all-isolated environment, and I only care about a very small number of the relevant degrees of freedom, I should instead represent it by a reduced density matrix. Since I've chosen to ignore most of the physical degrees of freedom in this system, the detector's position evolves in some complicated nonlinear way, but with the two possible readings as the only (relevant) stable states of the system. Which result actually happens depends on details of the state of the detector and the environment which aren't practically knowable, but the whole process is, in principle, deterministic. The macroscopic superposition from the first story never actually obtains, or if it does it quickly evolves into one of the two stable states.
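(As a crude classical cartoon of what I take the second story to be claiming, entirely my own and not from the papers: imagine the coarse-grained pointer variable obeying an effective nonlinear dissipative equation with two stable readings, where the outcome is deterministic but hinges on practically unknowable details of the initial data.)

```python
import numpy as np

# Crude classical cartoon of the "second story": an overdamped pointer
# variable x in a double well V(x) = -x^2/2 + x^4/4, with stable states
# x = +1 and x = -1. The evolution is deterministic, but which reading is
# reached depends on a tiny (practically unknowable) offset in the initial
# condition, standing in for the detector/environment microstate.
def settle(x0, dt=1e-3, steps=30000):
    x = x0
    for _ in range(steps):
        x += dt * (x - x**3)
    return x

print(settle(+1e-6))   # ~ +1.0
print(settle(-1e-6))   # ~ -1.0
```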

So, finally, here's what I'd like to understand better:

(0) Did I describe the second story correctly?

(1) It seems to me that the second story could be told entirely within what Neumaier calls the "formal core" of quantum mechanics, the part that every interpretation agrees on. In his language, after my experiment, the q-probability distribution of the location of the detector needle really is supported only in the "up" region, and this follows from ordinary, uncontroversial quantum mechanics. Is this right? Does anything about the second story actually depend on the thermal interpretation?

(2) A more philosophical question: If macroscopic superpositions never actually appear, why all the fuss about interpretations? (For example, the many worlds interpretation seems to exist entirely to describe what it would mean for the universe to end up in such a macroscopic superposition.) What else even is there to worry about? If this does resolve the measurement problem, why wasn't it pointed out a long time ago?

(3) I've seen many arguments (e.g. https://plato.stanford.edu/entries/qm-decoherence/#SolMeaPro, which cites https://arxiv.org/abs/quant-ph/0112095 and https://arxiv.org/abs/quant-ph/9506020 pp. 14-15) that sound to me like they're saying the second story can't possibly work, usually with language like "decoherence cannot solve the measurement problem". Am I misunderstanding them? If not, would the counterargument just be that they're making the same linearity mistake as the first story?
 
  • #2
nicf said:
(0) Did I describe the second story correctly?
Yes, if (in the case of the qubit discussed in Part IV of my series of papers) 'the system' exclusively refers to the reduced 2-state system and not to any other property of the detector.
nicf said:
(1) It seems to me that the second story could be told entirely within what Neumaier calls the "formal core" of quantum mechanics, the part that every interpretation agrees on. In his language, after my experiment, the q-probability distribution of the location of the detector needle really is supported only in the "up" region, and this follows from ordinary, uncontroversial quantum mechanics. Is this right? Does anything about the second story actually depend on the thermal interpretation?
As long as one only looks at an ensemble of similarly prepared systems, nothing depends on the thermal interpretation. But the thermal interpretation explains what happens in each individual case, and why.
nicf said:
(2) A more philosophical question: If macroscopic superpositions never actually appear, why all the fuss about interpretations? (For example, the many worlds interpretation seems to exist entirely to describe what it would mean for the universe to end up in such a macroscopic superposition.) What else even is there to worry about? If this does resolve the measurement problem, why wasn't it pointed out a long time ago?
That macroscopic systems can in principle be described by a pure state is a prerequisite of the traditional discussions, and is part of almost all interpretations in print. The thermal interpretation explicitly negates this assumption.
nicf said:
(3) I've seen many arguments (e.g. https://plato.stanford.edu/entries/qm-decoherence/#SolMeaPro, which cites https://arxiv.org/abs/quant-ph/0112095 and https://arxiv.org/abs/quant-ph/9506020 pp. 14-15) that sound to me like they're saying the second story can't possibly work, usually with language like "decoherence cannot solve the measurement problem". Am I misunderstanding them? If not, would the counterargument just be that they're making the same linearity mistake as the first story?
Decoherence cannot solve the measurement problem since it still assumes the eigenvalue link to measurement and hence has no explanation for unique outcomes. The thermal interpretation has unique outcomes built in from the outset, hence only has to explain the origin of the probabilities.
 
  • #3
Thanks for taking the time to reply! I have a couple more questions, but what you've said so far is helpful.

A. Neumaier said:
Yes, if (in case of the qubit discussed in Part IV of my series of papers) 'the system' exclusively refers to the reduced 2-state system and not to any other property of the detector.
Yes, that's what I meant --- I'm referring to the variable that encodes which of the two readings ends up being displayed on the detector, and this omits the vast majority of the physical properties of the detector and indeed the rest of the universe. I think we're on the same page.

A. Neumaier said:
As long as one only looks at an ensemble of similarly prepared systems, nothing depends on the thermal interpretation. But the thermal interpretation explains what happens in each individual case, and why.
I think I understand what you're saying here, but I'm asking because I thought that, in addition, you were actually claiming something even stronger: that the "first story" fails on its own terms. That is, I read you as saying that the problem arises from describing the detector as a pure state, which forces you into linear dynamics, which in turn forces you into the macroscopic superposition. You seem to be saying the pure-state assumption is simply a mistake no matter which interpretation you subscribe to, because the detector needle isn't isolated from its environment. Is that right?

Once one agrees that the macroscopic superposition can't happen, and that in the end the q-probability distribution of the location of the needle has almost all its mass in one of the two separated regions, it seems to me that we've already eliminated all the "mystery" that's usually associated with quantum measurements --- you now just need some way to attach a physical meaning to the mathematical objects in front of you, and I agree with you that, since the q-variance is small, it's very natural to interpret the q-expectation of the needle position variable as "where the needle is".

Part of the reason I've enjoyed reading this series of papers is that I find your explanation of measurement very attractive; it's the only story I've ever seen that I could imagine finding fully satisfying. The reason I'm confused is that I don't understand why, if the macroscopic superposition actually doesn't occur, anyone would still be proposing things like many-worlds, Bohmian mechanics, or objective collapse theories. When smart people do things that don't make sense to me, it makes me think I'm not understanding something! Are the people proposing these other interpretations just making the mistake of trying to describe the detector with a pure state?
 
  • #4
nicf said:
Once one agrees that the macroscopic superposition can't happen
Macroscopically, one has density operators, and talking about their superposition is meaningless.
nicf said:
Are the people proposing these other interpretations are just making the mistake of trying to describe the detector with a pure state?
In the standard interpretations, this is not a mistake but a principal feature!
 
  • #5
Well, there are some macroscopic systems which show specific quantum behavior (superfluidity of liquid Helium, superconductivity), but it's of course not so easy to prepare macroscopic systems in states such that quantum behavior is observable in macroscopic quantities. That's why classical physics indeed usually works so well for macroscopic matter. The standard interpretation of the QT formalism allows me to say that this is due to coarse graining, averaging over many microscopic degrees of freedom such that quantum fluctuations become irrelevant for the observed macroscopic quantities on the scales of the typical resolution of their dynamical behavior. Within the thermal interpretation I'm not allowed to say this anymore, but I don't know what I'm allowed to say.

It's new to me that detectors are described by "pure states". Usually a detector is described as a classical macroscopic device. Or which particular example do you have in mind?
 
  • #6
vanhees71 said:
It's new to me that detectors are described by "pure states". Usually a detector is described as a classical macroscopic device. Or which particular example do you have in mind?
Well, surely a classical macroscopic device is also a quantum system, hence described by a quantum state. At least people who need to design detectors in silico treat the macroscopic system formed by the device as a quantum system.

Elsewhere you just said,
vanhees71 said:
Incoherent light can, e.g., be described by taking the intensity of coherent light and randomize phase differences. The same holds for polarization.
Thus you regard the mixed quantum state with rotation invariant density matrix as a randomized pure (polarized) quantum state. This exemplifies the rule, stated in all standard quantum mechanics books, that the density operator of any quantum system is regarded (by the orthodoxy supported, e.g., by Landau and Lifshitz) as representing an unknown pure state, randomized over the macroscopic uncertainty.
 
  • #7
nicf said:
(3) I've seen many arguments (e.g. https://plato.stanford.edu/entries/qm-decoherence/#SolMeaPro, which cites https://arxiv.org/abs/quant-ph/0112095 and https://arxiv.org/abs/quant-ph/9506020 pp. 14-15) that sound to me like they're saying the second story can't possibly work, usually with language like "decoherence cannot solve the measurement problem". Am I misunderstanding them? If not, would the counterargument just be that they're making the same linearity mistake as the first story?
Decoherence does solve a big part of the measurement problem, because it shows how wavefunctions will seem to collapse upon measurement, without having to postulate that they actually do collapse. The demonstration of this is just a technical matter, independent of any interpretation. What it doesn't explain is where probabilities come from - that is another story.
 
  • #8
vanhees71 said:
Well, there are some macroscopic systems which show specific quantum behavior (superfluidity of liquid Helium, superconductivity), but it's of course not so easy to prepare macroscopic systems in states such that quantum behavior is observable in macroscopic quantities.
The macroscopic laws of a superfluid are as classical as the macroscopic laws of hydromechanics for water, though quantitatively slightly different. Both depend for their details (thermodynamic state functions) on quantum properties of matter. But the macroscopic limit is in both cases classical and deterministic.

vanhees71 said:
That's why classical physics indeed usually works so well for macroscopic matter. The standard interpretation of the QT formalism allows me to say that this is due to coarse graining, averaging over many microscopic degrees of freedom such that quantum fluctuations become irrelevant for the observed macroscopic quantities on the scales of the typical resolution of their dynamical behavior. Within the thermal interpretation I'm not allowed to say this anymore, but I don't know what I'm allowed to say.
The thermal interpretation also explains this by coarse-graining, but the latter is not seen as an averaging process (which it isn't in the standard formulations of coarse graining, except in limiting cases such as very dilute gases). Instead, coarse-graining is seen as an approximation process in which one restricts attention to a collection of relevant macroscopic variables and neglects small amplitude variations with high spatial or temporal frequencies.
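
As a simple illustration of the difference (a toy numerical example of my own, not taken from my papers): coarse graining in this sense is a projection onto slowly varying modes, not an average over replicas or over microscopic constituents.

```python
import numpy as np

# Coarse graining as a low-pass projection: keep only the slowly varying
# Fourier modes of a field and neglect the small, rapidly oscillating ones.
# No statistical averaging over microscopic degrees of freedom is involved.
x = np.linspace(0, 1, 1024, endpoint=False)
field = np.sin(2 * np.pi * 3 * x) + 0.05 * np.sin(2 * np.pi * 200 * x)

modes = np.fft.rfft(field)
modes[20:] = 0                       # drop high spatial frequencies
coarse = np.fft.irfft(modes, n=1024)

print(np.max(np.abs(coarse - np.sin(2 * np.pi * 3 * x))))  # tiny: only the slow mode survives
```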
 
  • #9
A. Neumaier said:
Well, surely a classical macroscopic device is also a quantum system, hence described by a quantum state. At least people who need to design detectors in silico treat the macroscopic system formed by the device as a quantum system.

Elsewhere you just said,

Thus you regard the mixed quantum state with rotation invariant density matrix as a randomized pure (polarized) quantum state. This exemplifies the rule, stated in all standard quantum mechanics books, that the density operator of any quantum system is regarded (by the orthodoxy supported, e.g., by Landau and Lifshitz) as representing an unknown pure state, randomized over the macroscopic uncertainty.
Well, it's one way to describe it. I'd not say that's the most general case. Another important example is if you have a quantum system that may be prepared in a pure state and you want to describe a subsystem, which you describe by the (usually mixed-state) statistical operator you get from a partial trace.

I'd say that a measurement device is usually described by a mixed rather than a pure state.
 
  • #10
A. Neumaier said:
The macroscopic laws of a superfluid are as classical as the macroscopic laws of hydromechanics for water, though quantitatively slightly different. Both depend for their details (thermodynamic state functions) on quantum properties of matter. But the macroscopic limit is in both cases classical and deterministic. The thermal interpretation also explains this by coarse-graining, but the latter is not seen as an averaging process (which it isn't in the standard formulations of coarse graining, except in limiting cases such as very dilute gases). Instead, coarse-graining is seen as an approximation process in which one restricts attention to a collection of relevant macroscopic variables and neglects small amplitude variations with high spatial or temporal frequencies.
But that in fact IS the usual coarse-graining I'm talking about. You average over the many microscopic details to describe the average behavior of macroscopic observables. One usual formal way is the gradient expansion (which can also be formulated as a formal ##\hbar## expansion). So after all the thermal interpretation is again equivalent to the standard interpretation? Still puzzled...
 
  • #11
vanhees71 said:
I'd say that a measurement device is usually described by a mixed rather than a pure state.
I agree. But this description is usually (and in particular by Landau & Lifshits) taken to be statistical, i.e., as a mixture indicating ignorance of the true pure state.
vanhees71 said:
Well, it's one way to describe it. I'd not say that's the most general case. Another important example is if you have a quantum system that may be prepared in a pure state and you want to describe a subsystem, which you describe by the (usually mixed-state) statistical operator you get from a partial trace.
Well, you could consider the detector as being a subsystem of the lab; then the lab would be in an unknown pure state (but described by a mixture mostly in local equilibrium) and the detector would be described by a partial trace.

Within the traditional foundation you cannot escape assuming that the biggest system considered should be in a pure state if the details could be gathered for describing this state.
 
  • #12
vanhees71 said:
But that in fact IS the usual coarse-graining I'm talking about. You average over the many microscopic details to describe the average behavior of macroscopic observables.
I don't see any of this in the usual 2PI formalism for deriving the coarse-grained kinetic equations of Kadanoff-Baym, say.
vanhees71 said:
One usual formal way is the gradient expansion (which can also be formulated as a formal ##\hbar## expansion).
In which step, precisely, does the gradient expansion involve an average over microscopic details (rather than an ensemble average over imagined replicas of the fields) ?
vanhees71 said:
So after all the thermal interpretation is again equivalent to the standard interpretation? Still puzzled...
On the level of statistical mechanics, the thermal interpretation is essentially equivalent to the standard interpretation, except for the way of talking about things. The thermal interpretation talk is adapted to the actual usage rather than to the foundational brimborium.

On this level, the thermal interpretation allows one to describe with the multicanonical ensemble a single lump of silver, whereas tradition takes the ensemble to be a literal ensemble of many identically prepared lumps of silver - even when only one of a particular form (a statue of Pallas Athene, say) has ever been prepared.
 
  • #13
In the 2PI treatment for deriving the Kadanoff-Baym equation and then doing the gradient expansion to get quantum-transport equations you work in the Wigner picture, i.e., with a Fourier transform in ##(x-y)##, where ##x## and ##y## are the space-time coordinates of the two-point (contour) Green's function. There you neglect the rapid changes in the variable ##(x-y)##, which is effectively an averaging out of (quantum) fluctuations.

In standard quantum-statistical mechanics one very well describes single macroscopic objects like a lump of silver. The coarse graining is over microscopically large but macroscopically small space-time cells. The Gibbs ensemble is just a tool to think statistically about this (or to program Monte Carlo simulations ;-)).

You still have not made clear to me what the Thermal Interpretation really is, if I'm not allowed to think about ##\mathrm{Tr}(\hat{\rho} \hat{O})## as an averaging procedure!
 
  • #14
vanhees71 said:
In the 2PI treatment for deriving the Kadanoff-Baym equation and then doing the gradient expansion to get quantum-transport equations you work in the Wigner picture, i.e., with a Fourier transform in ##(x-y)##, where ##x## and ##y## are the space-time coordinates of the two-point (contour) Green's function. There you neglect the rapid changes in the variable ##(x-y)##, which is effectively an averaging out of (quantum) fluctuations.
It smoothes rapid spatial changes irrespective of their origin. This is of the same kind as when in classical optics one averages over fast oscillations. It has nothing to do with microscopic degrees of freedom - it is not an average over a large number of atoms or electrons!

vanhees71 said:
In standard quantum-statistical mechanics one very well describes single macroscopic objects like a lump of silver. The coarse graining is over microscopically large but macroscopically small space-time cells. The Gibbs ensemble is just a tool to think statistically about this (or to program Monte Carlo simulations ;-)).
As it is defined, the Gibbs ensemble is an ensemble of copies of the original lump of silver. This was clearly understood in Gibbs' time, when it was a major point of criticism of his method! For example, on p.226f of
  • P. Hertz, Über die mechanischen Grundlagen der Thermodynamik, Ann. Physik IV. Folge (33) 1910, 225--274.
one can read:
Paul Hertz said:
Understood in this way, the Gibbsian definition seems downright absurd. How is a quantity that really belongs to the body supposed to depend not on the state it has, but on the state it could possibly have? [...] An ensemble is mathematically feigned [...] it appears difficult, if not impossible, to attach a physical meaning to the concept of the canonical ensemble.
To reinterpret it as an ensemble of space-time cells is completely changing the meaning it has by definition!
vanhees71 said:
You still have not made clear to me what the Thermal Interpretation really is, if I'm not allowed to think about ##\mathrm{Tr}(\hat{\rho} \hat{O})## as an averaging procedure!
You may think of it as a purely mathematical computation, of the same kind as many other purely mathematical computations done in the derivation of the Kadanoff-Baym equations. You may think of the result as the ''macroscopic value'' of ##O##, lying somewhere in the convex hull of the spectrum of ##O##.
 
  • #15
I see. So it's just an extreme form of the shut-up-and-calculate advice: you use the established math without any heuristics simply because it works. I find this quite nice, but it's hard to believe that, without some heuristics connected to the abstract formalism, QT would ever have been so successfully applied to the description of real-world processes.

In our interpretation of the standard derivation I think we agree, because indeed it's just the same averaging process as in classical statistics. That you may average over much more than just quantum fluctuations is also clear. That's done to the extreme when you further break the dynamics down to ideal hydro, i.e., assuming local equilibrium. From there you go in the other direction, figuring in ever more fluctuations in various ways to derive viscous hydro (Chapman-Enskog, method of moments, etc.).
 
  • #16
vanhees71 said:
I see. So it's just an extreme form of the shut-up-and-calculate advice: you use the established math without any heuristics simply because it works. I find this quite nice, but it's hard to believe that, without some heuristics connected to the abstract formalism, QT would ever have been so successfully applied to the description of real-world processes.
The heuristic of ignoring tiny high frequency contributions in space or time - independent of any reference to the microscopic degrees of freedom - is very powerful and sufficient to motivate everything that works. For example, the gradient expansion can be motivated by single particle quantum mechanics where in the position representation, the gradient expansion is just an expansion into low powers of momentum, i.e., a low momentum = slow change expansion. One just keeps the least changing contributions. Clearly, this is not averaging over microscopic degrees of freedom.
vanhees71 said:
In our interpretation of the standard derivation I think we agree, because indeed it's just the same averaging process as in classical statistics.
Effectively, yes, since you are employing the averaging idea for much more than only statistics.
But from a strict point of view there is a big difference, since the averaging per se has nothing to do with statistics. The thermal interpretation is about giving up statistics at a fundamental level and employing it only where it is needed to reduce the amount of noise. This makes the thermal interpretation applicable to single systems where at least the literal use of the traditional quantum postulates would require (as in Gibbs' time) the use of a fictitious ensemble of imagined copies of the whole system.
 
  • #17
I've still no clue: what is the meaning of the formula ##\langle A \rangle=\mathrm{Tr}(\hat{\rho} \hat{A})##, if it's not an averaging procedure? It doesn't need to be an ensemble average. You can also simply "coarse grain" in the sense you describe it, i.e., average over "microscopically large, macroscopically small" space (or space-time) volumes. This is in fact what's effectively done in the gradient expansion.

Of course, another argument for the gradient expansion as a means to derive effective classical descriptions for macroscopic quantities is that it can as well be formalized as an expansion in powers of ##\hbar##.

I also don't see a problem with the treatment of a single system in standard quantum theory within the standard statistical interpretation of the state, since, whenever the classical approximation is valid, the standard deviations from the mean values of the macroscopic observables are irrelevant (that's a tautology), and then the probabilistic nature of the quantum state is simply hard to observe and everything looks classical.

Take the famous ##\alpha##-particles in a cloud chamber as an example a la Mott. Each single particle seems to behave classically, i.e., to follow a classical (straight) trajectory, but of course that's because it's not a single-particle system at all, but a single particle interacting (practically continuously) with the vapor in the cloud chamber. The macroscopic trajectory, for which you can in principle observe position and velocity of the particle at a macroscopic level of accuracy by just observing the trails building up while the particle is moving through the chamber, is due to this interaction "with the environment". For a single ##\alpha## particle in a vacuum originating from a single ##\alpha##-decaying nucleus you cannot say too much indeed: you neither know when exactly it's created nor in which direction it's flying, while all this is known simply by observation of the macroscopic trails of the ##\alpha## particle in the cloud chamber.
 
  • #18
vanhees71 said:
I've still no clue: what is the meaning of the formula ##\langle A \rangle=\mathrm{Tr}(\hat{\rho} \hat{A})##, if it's not an averaging procedure?
It's a property of the system like angular momentum in Classical Mechanics.
 
  • #19
vanhees71 said:
I've still no clue: what is the meaning of the formula ##\langle A \rangle=\mathrm{Tr}(\hat{\rho} \hat{A})##, if it's not an averaging procedure? It doesn't need to be an ensemble average.
But you defined it in your lecture notes as being the average over the ensemble of all systems prepared in the state ##\rho##. Since you now claim the opposite, you should emphasize in your lecture notes that it does not need to be an ensemble average, but also applies to a single, uniquely prepared system, such as the beautifully and uniquely shaped lump of silver under discussion.

In my view, the meaning of the formula ##\langle A \rangle=\mathrm{Tr}(\rho A)## is crystal clear. It is the trace of a product of two Hermitian operators, expressing a property of the system in the quantum state ##\rho##, just like a function ##A(p,q)## of a classical Hamiltonian system expresses a property of the system in the given classical state ##(p,q)##.

Ostensibly, ##\langle A \rangle## is not an average of anything (unless you introduce additional machinery and then prove it to be such an average). If the Hilbert space is ##C^n##, it is a weighted sum of the ##n^2## matrix elements of ##A##, with in general complex weights. Nothing at all suggests this to be an average over microscopic degrees of freedom, or over short times, or over whatever else you may think of.
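
For illustration (a small numerical toy of my own, not from the papers), ##\mathrm{Tr}(\rho A)## is computed directly from the matrix elements, with no ensemble anywhere, and always lies in the convex hull of the spectrum of ##A##:

```python
import numpy as np

# Tr(rho A) as a plain weighted sum of matrix elements of A, for one single
# quantum state rho; no ensemble is involved anywhere in the computation.
rng = np.random.default_rng(0)
A = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
A = (A + A.conj().T) / 2                            # a Hermitian observable

psi = rng.normal(size=2) + 1j * rng.normal(size=2)
psi /= np.linalg.norm(psi)
rho = 0.7 * np.outer(psi, psi.conj()) + 0.3 * np.eye(2) / 2   # some density operator

value = np.trace(rho @ A).real
lo, hi = np.linalg.eigvalsh(A)[[0, -1]]
print(value, lo <= value <= hi)       # the value lies in the convex hull of spec(A)
```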
 
  • #20
Of course I've defined it in this way, because it's the easiest way to derive the formalism. Also, the math is indeed crystal clear, but to do physics you need to know how the procedure relates to real-world observables. So still, if ##\langle A \rangle## is not an average, I don't know what it is and how to apply it to the real world.

Obviously you simply don't understand the argument why the formal manipulations used to derive macroscopic (classical) behavior, be it from quantum or classical statistical physics, are always an averaging procedure over many microscopic degrees of freedom. That all started from the very beginning of statistical mechanics with Bernoulli, Maxwell, and Boltzmann. It has also been used in classical electrodynamics to describe the intensity of electromagnetic fields, particularly in optics (where the averaging is a time average), and in the derivation of electromagnetic properties of matter from classical electron theory (where the averaging is over spatial cells) by Lorentz et al. The same holds true for hydrodynamics (local thermal equilibrium and the corresponding expansions around it to yield all kinds of transport coefficients). All this is completely analogous in Q(F)T. It's the same basic understanding of the physics underlying the mathematical techniques which indeed turned out to be successful.

So it seems as if the Thermal Interpretation is just the "shut-up-and-calculate interpretation" pushed into the extreme, such that it's not useful anymore for the (phenomenological) theoretical physicist. I've some sympathies for this approach, because it avoids philosophical gibberish confusing the subject, but if you don't allow heuristic thinking (like the extremely useful idea of the Gibbs ensemble), there's no chance to apply a theory to new physical problems in the real world.
 
  • #21
vanhees71 said:
if ##\langle A \rangle## is not an average, I don't know what it is and how to apply it to the real world.
By your (i.e., the conventional minimal statistical) definition, it is an average over the ensemble of equally prepared systems, and nothing else.

How to apply it to the real world, e.g., to a single beautifully shaped lump of silver or to hydromechanics, should be a consequence of the definitions given. If you interpret it as another average, you therefore need to derive it from this original definition (which is possible only in very special model cases). Otherwise, why should one believe you?
vanhees71 said:
Obviously you simply don't understand the argument why the formal manipulations used to derive macroscopic (classical) behavior, be it from quantum or classical statistical physics, are always an averaging procedure over many microscopic degrees of freedom. That all started from the very beginning of statistical mechanics with Bernoulli, Maxwell, and Boltzmann.
Yes, I really don't understand it, since in this generality it is simply false. Your argument is valid only in the special case where you assume (as Bernoulli, Maxwell, and Boltzmann) an ideal gas, so that you have an ensemble of independent particles and not (as in dense matter and in QFT) an ensemble of large systems.
vanhees71 said:
if you don't allow heuristic thinking (like the extremely useful idea of the Gibbs ensemble), there's no chance to apply a theory to new physical problems in the real world.
The thermal interpretation turns the heuristic thinking of Gibbs (where people complained how the property of a realization can depend on a theory for all the possibilities, which is indeed not sensible) into something rational that needs no heuristic anymore. Physicists are still allowed to use in addition to the formally correct stuff all the heuristics they are accustomed to, as long as it leads to correct predictions, just as they use the heuristics of virtual particles popping in and out of existence, while in fact they just work with the formal rules.
 
  • #22
Sure, in practice nearly everything is mapped to an ideal gas of quasiparticles, if possible, and it's amazing how far you get with this strategy. Among other things it can describe the color of a shiny lump of silver or the hydrodynamics of a fluid.
 
  • #23
Michael Price said:
Decoherence does solve a big part of the measurement problem, because it shows how wavefunctions will seem to collapse upon measurement, without having to postulate that they actually do collapse. The demonstration of this is just a technical matter, independent of any interpretation. What it doesn't explain is where probabilities come from - that is another story.

This is a clearer way of saying exactly what I meant, thank you :). Let me use this as a jumping-off point to try to state my original question more clearly, since I think I am still confused.

The part of the measurement problem that's relevant to my question is exactly the part that decoherence doesn't try to solve: what determines which of (say) two possible measurement outcomes I actually end up seeing? The reason I'm confused is that, when I try to combine what I understand about decoherence with what I understand about the account described in the thermal interpretation, I arrive at two conclusions that don't line up with each other:

(a) The decoherence story as it's usually given explains how, using ordinary unitary quantum mechanics with no collapse, I end up in a state where neither possible outcome can "interfere" with the other (since both outcomes are entangled with the environment), thereby explaining why the wavefunction appears to collapse. But if I write down a mathematical description of the final state, there are parts of it that correspond to both of the two possibilities with no way to choose between them. This explanation comes with the additional claim that, due to the linearity of time evolution, there's no possible way that the final state could privilege one outcome over the other. (Bell is also often invoked here, although I don't know if I know how to turn that into a proof.)

(b) The description in the thermal interpretation papers seems to claim that, in fact, if I had a good enough description of the details of the initial state of the microscopic system and the measurement apparatus, I would be able to deduce which of the two possibilities "really happened", and that I could do this, again, using only ordinary unitary quantum mechanics with no collapse.

Since both stories use the same initial condition and the same rule for evolving in time, these seem to be two different claims about the exact same mathematical object --- the density matrix of the final state of the system. If that's true, then one of them ought to be wrong. The claim in (b) is much stronger than (a), and I think that if it works any reasonable person ought to regard something like (b) as a solution to the measurement problem! It would certainly be enough to satisfy me. But I've heard (a) enough times that I'm confused about (b). Is your position that the (a) story is incorrect, or am I misunderstanding something else?
 
  • #24
nicf said:
I could do this, again, using only ordinary unitary quantum mechanics with no collapse.

I'm not sure this is possible. Ordinary unitary QM with decoherence can give you two non-interfering outcomes. It can't give you a single outcome; that obviously violates linearity.

My understanding of the thermal interpretation (remember I'm not its author so my understanding might not be correct) is that the two non-interfering outcomes are actually a meta-stable state of the detector (i.e., of whatever macroscopic object is going to irreversibly record the measurement result), and that random fluctuations cause this meta-stable state to decay into just one of the two outcomes. An analogy that I have seen @A. Neumaier use is a ball on a very sharp peak between two valleys; the ball will not stay on the peak because random fluctuations will cause it to jostle one way or the other and roll down into one of the valleys.

However, the dynamics of this collapse of a meta-stable detector state into one of the two stable outcomes can't be just ordinary unitary QM, because ordinary unitary QM is linear and linear dynamics can't do that. In ordinary unitary QM, fluctuations in the detector would just become entangled with the system being measured and would preserve the multiple outcomes. There would have to be some nonlinear correction to the dynamics to collapse the state into just one outcome.
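
(Here is a toy classical sketch of the ball-on-a-peak picture, entirely my own and not from the papers: weak noise decides which valley the metastable state decays into, and afterwards the outcome is stable against further fluctuations.)

```python
import numpy as np

# Toy Langevin version of the ball-on-a-sharp-peak picture: weak noise
# tips the metastable state x = 0 into one of the stable valleys x = +1
# or x = -1, after which further fluctuations no longer change the outcome.
rng = np.random.default_rng(1)

def settle(dt=1e-3, steps=50000, noise=0.05):
    x = 0.0                                   # start exactly on the peak
    for _ in range(steps):
        x += dt * (x - x**3) + np.sqrt(dt) * noise * rng.normal()
    return x

print([round(settle(), 2) for _ in range(5)])  # each run ends near +1 or -1
```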
 
  • #25
PeterDonis said:
I'm not sure this is possible. Ordinary unitary QM with decoherence can give you two non-interfering outcomes. It can't give you a single outcome; that obviously violates linearity.

Exactly, that's why I'm confused! My impression is that @A. Neumaier is somehow denying this, and that somehow the refusal to describe macroscopic objects with state vectors is related to the way he gets around this linearity argument, although I don't see how.

If we're supposed to be positing nonunitary dynamics on a fundamental level, then that would obviate my whole question, but from the papers I understood @A. Neumaier to be specifically not doing that.
 
  • #26
nicf said:
(b) The description in the thermal interpretation papers seems to claim that, in fact, if I had a good enough description of the details of the initial state of the microscopic system and the measurement apparatus, I would be able to deduce which of the two possibilities "really happened", and that I could do this, again, using only ordinary unitary quantum mechanics with no collapse.
Correct.
nicf said:
Since both stories use the same initial condition and the same rule for evolving in time, these seem to be two different claims about the exact same mathematical object --- the density matrix of the final state of the system. If that's true, then one of them ought to be wrong.
No. Decoherence tells the same story but only in the statistical interpretation (using Lindblad equations rather than stochastic trajectories), where ensembles of many identically prepared systems are considered, so that only the averaged results (which must feature all possibilities) can be deduced. The thermal interpretation refines this to a different, more detailed story for each single case. Averaging the latter recovers the former.
PeterDonis said:
My understanding of the thermal interpretation (remember I'm not its author so my understanding might not be correct) is that the two non-interfering outcomes are actually a meta-stable state of the detector (i.e., of whatever macroscopic object is going to irreversibly record the measurement result), and that random fluctuations cause this meta-stable state to decay into just one of the two outcomes. An analogy that I have seen @A. Neumaier use is a ball on a very sharp peak between two valleys; the ball will not stay on the peak because random fluctuations will cause it to jostle one way or the other and roll down into one of the valleys.
Correct.
PeterDonis said:
However, the dynamics of this collapse of a meta-stable detector state into one of the two stable outcomes can't be just ordinary unitary QM, because ordinary unitary QM is linear and linear dynamics can't do that. In ordinary unitary QM, fluctuations in the detector would just become entangled with the system being measured and would preserve the multiple outcomes. There would have to be some nonlinear correction to the dynamics to collapse the state into just one outcome.
I explained how the nonlinearities naturally come about through coarse graining. An example of coarse graining is the classical limit, where nonlinear Hamiltonian dynamics arises from linear quantum dynamics for systems of sufficiently heavy balls. This special case is discussed in Section 2.1 of Part IV, and explains to some extent why heavy objects behave classically but nonlinearly.
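Schematically (with ##q##, ##p##, ##V## a generic position, momentum and potential; the precise statements are in that section), the mechanism is of the Ehrenfest type: for the q-expectations one has
$$\frac{d}{dt}\langle q\rangle = \frac{\langle p\rangle}{m},\qquad \frac{d}{dt}\langle p\rangle = -\langle V'(q)\rangle \approx -V'(\langle q\rangle),$$
and the approximation in the last step (valid when the position uncertainty is small compared to the scale on which ##V'## varies) is what turns the exactly linear quantum dynamics into effectively nonlinear classical dynamics for the macroscopic variables.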

nicf said:
Exactly, that's why I'm confused! My impression is that @A. Neumaier is somehow denying this, and that somehow the refusal to describe macroscopic objects with state vectors is related to the way he gets around this linearity argument, although I don't see how.

If we're supposed to be positing nonunitary dynamics on a fundamental level, then that would obviate my whole question, but from the papers I understood A. Neumaier to be specifically not doing that.
As you can see from the preceding, this is not necessary.
 
  • #27
PeterDonis said:
Ordinary unitary QM with decoherence can give you two non-interfering outcomes. It can't give you a single outcome; that obviously violates linearity.
This is according to the traditional interpretations, where outcome = eigenvalue, which must be statistical. But in the thermal interpretation, outcome = q-expectation, which is always single-valued. This makes a big difference in the interpretation of everything! See Chapter 4 of Part IV.
 
  • #28
Now it goes in circles again. The identification of expectation values with measurement outcomes had to be abandoned very early in the history of the development of QT. E.g., it contradicts the fact that the absorption and emission of electromagnetic radiation by charged-matter systems is in discrete "lumps of energy" ##\hbar \omega##. That's in fact how the whole quantum business started with Planck's analysis of the black-body spectrum.
 
  • #29
vanhees71 said:
Now it goes in circles again. The identification of expectation values with measurement outcomes had to be abandoned very early in the history of the development of QT.
The observable outcome is a property of the detector, e.g., a photocurrent. This is not quantized but a continuous burst, for each single observation of a detection event. The derivation of macroscopic electrodynamics from QED shows that the measured currents are q-expectations. No experiment in the history of quantum mechanics contradicts this.

The relation between observed outcome and true result is in general approximate (especially when the spectrum spacing is of the order of the observation error or larger) and depends on what one considers to be the true (unobservable) result. This is not observable, hence a matter of interpretation.

Here tradition and the thermal interpretation differ in what they postulate to be the true result, i.e., how to split the observed result (a left spot and a right spot) into a true result (eigenvalue or q-expectation) and an observational error (the difference). See Sections 4.1 and 4.2 of Part IV.

Since this doesn't change the experimental record, it cannot be contradicted by any experiment.
vanhees71 said:
E.g., it contradicts the fact that the absorption and emission of electromagnetic radiation by charged-matter systems is in discrete "lumps of energy" ##\hbar \omega##. That's in fact how the whole quantum business started with Planck's analysis of the black-body spectrum.
The black body spectrum was explained by Bose 1924 by the canonical ensemble of a Bose-Einstein gas, although Planck had derived it from quantized lumps of energy. Only a quantized spectrum is needed, no discrete lumps of radiation energy.

Just as the photoeffect could be explained by Wentzel 1928 with classical light (no lumps of energy), although Einstein had originally explained it in terms of quantized light. Only quantized electrons are needed, no discrete lumps of radiation energy.
 
  • #30
Well, Planck's derivation was in terms of a canonical ensemble of a Bose-Einstein gas. Of course, at that time it wasn't known as such.

The "q-expectation" value in general does not reflect what's measured by an actual device. For this you'd have to put information on the device into the description. This of course always have to be done when data from a real detector are evaluated, but it cannot be part of the general description of a system.

Nowadays it's no problem to prepare single photons, and all experiments show that an "entire photon" is registered (if it is registered at all) but not some fraction of a photon. So obviously at this (today indeed technically realized!) resolution, you measure discrete photon energies ##\hbar \omega## and not some expectation value.
 
  • #31
vanhees71 said:
Well, Planck's derivation was in terms of a canonical ensemble of a Bose-Einstein gas.
For the equilibrium thermodynamics of a Bose-Einstein gas one doesn't need anything more than the maximum entropy state corresponding to the q-expectation of a Hamiltonian with discrete eigenvalues. The interpretation in terms of discrete lumps of energy, which according to you is necessary, nowhere enters.

vanhees71 said:
The "q-expectation" value in general does not reflect what's measured by an actual device.
Any measured current is the q-expectation of a smeared version of the QED current; only the details of the smearing depend on the actual device. This is the q-expectation that is relevant for the thermal interpretation.

vanhees71 said:
Nowadays it's no problem to prepare single photons, and all experiments show that an "entire photon" is registered (if it is registered at all) but not some fraction of a photon. So obviously at this (today indeed technically realized!) resolution, you measure discrete photon energies ##\hbar \omega## and not some expectation value.
No. What is measured in each single photodetection event (called a photon by convention) is a magnified current of energy much larger than ##\hbar \omega##.
 
  • #32
Indeed, the Hamiltonian, representing the energy of the system (in this case an ensemble of non-interacting harmonic oscillators, representing the em. field), takes discrete values, which are the possible outcomes of precise measurement of energy, while the expectation values can take all continuous values ##\geq 0##.
 
  • #33
vanhees71 said:
the Hamiltonian, representing the energy of the system (in this case an ensemble of non-interacting harmonic oscillators, representing the em. field), takes discrete values, which are the possible outcomes of precise measurement of energy
Well, only energy differences are measured, and in general many at the same time (through emission or absorption spectra). These all have widths and do not give exact values, and recovering the energy levels from a spectrum is a highly nontrivial process.

Thus the connection between measured values (always on a continuous scale, a q-expectation of something macroscopic) and the theoretical true values is always somewhat indirect, and therefore the designation of something (eigenvalue or q-expectation) as the true measurement value is a matter of interpretation. Tradition and the thermal interpretation differ in this choice.
 
  • #34
A. Neumaier said:
in the thermal interpretation, outcome = q-expectation, which is always single-valued.

Yes, you're right, I left that part out. But for the benefit of @nicf, it might be worth spelling out how this works in the case of a simple binary measurement such as the Stern-Gerlach experiment. Say we are measuring a spin-z up electron using a Stern-Gerlach apparatus oriented in the x direction. Then we have the following account of what happens according to ordinary QM vs. the thermal interpretation:

(a) Ordinary QM: the measurement creates an entanglement between the spin of the electron and its momentum (which direction it comes out of the apparatus). When this entangled state interacts with the detector, decoherence occurs, which produces two non-interfering outcomes. How this becomes one outcome (or the appearance of one) depends on which interpretation you adopt (where "interpretation" here means basically collapse vs. no collapse, something like Copenhagen vs. something like MWI).

(b) Thermal interpretation: The q-expectation of the measurement is zero (an equal average of +1 and -1), but each individual measurement gives an inaccurate result because of the way the measurement/detector are constructed, so only the average over many results on an ensemble of identically prepared electrons will show the q-expectation. For each individual measurement, random nonlinear fluctuations inside the detector cause the result to be either +1 or -1.
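
(As a quick numerical check of the q-expectation in this example, just my own illustration with the standard Pauli matrix:)

```python
import numpy as np

# Spin-z-up electron measured along x: the q-expectation of sigma_x is 0,
# while the eigenvalues (the traditional "true values") are +1 and -1.
sigma_x = np.array([[0.0, 1.0], [1.0, 0.0]])
up_z = np.array([1.0, 0.0])
rho = np.outer(up_z, up_z)

print(np.trace(rho @ sigma_x))        # 0.0 : single-valued q-expectation
print(np.linalg.eigvalsh(sigma_x))    # [-1.  1.] : the two possible eigenvalue outcomes
```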
 
  • #35
PeterDonis said:
(b) Thermal interpretation: The q-expectation of the measurement is zero (an equal average of +1 and -1), but each individual measurement gives an inaccurate result because of the way the measurement/detector are constructed, so only the average over many results on an ensemble of identically prepared electrons will show the q-expectation. For each individual measurement, random nonlinear fluctuations inside the detector cause the result to be either +1 or -1.
I might be misunderstanding you, but I actually think (b) is not what @A. Neumaier is saying, at least if by "random nonlinear fluctuations" you mean that there's some change to the unitary dynamics underlying quantum mechanics. Rather, he's saying that the nonlinearity comes from coarse-graining, that is, from neglecting some details of the state, which would actually evolve linearly if you could somehow add those details back in.

This was my reading from the beginning, and is the source of my question. I feel quite stupid right now and that I must be missing something obvious, but I'm going to press on anyway and try to be more specific about my confusion.

----------------------------------

Let's start with the setup in Section 3 of the fourth TI paper, where we've written the Hilbert space of the universe as ##H=H^S\otimes H^E## where ##H^S## is two-dimensional, and assume our initial state is ##\rho_0=\rho^S\otimes\rho^E##. We have some observable ##X^E## on ##H^E##, and we're thinking of this as being something like the position of the tip of a detector needle. Using thermal interpretation language, we can say that we're interested in the "q-probability distribution" of ##X^E## after running time forward from this initial state; this can be defined entirely using q-expectation values, so I think @A. Neumaier and I agree that this is a physically meaningful object to discuss. If, after some experiment, the q-probability distribution of ##X^E## has most of its mass near some particular ##x\in\mathbb{R}##, then I'm happy to say that there's no "measurement problem" with respect to that experiment.
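
(Concretely, by the q-probability distribution of ##X^E## I mean the following, written here for a toy discretized needle; all of the specific names, dimensions, and the random ##\rho## are my own illustration:)

```python
import numpy as np

# q-probability of finding the "needle" observable X^E in a region Delta,
# defined purely via q-expectations of spectral projectors: Tr(rho (1 (x) P_Delta)).
dim_E = 8
positions = np.linspace(-1, 1, dim_E)           # eigenvalues of X^E (toy needle positions)

rng = np.random.default_rng(2)
M = rng.normal(size=(2 * dim_E, 2 * dim_E))
rho = M @ M.T
rho /= np.trace(rho)                            # some density matrix on H^S (x) H^E

def q_prob(mask):                               # mask selects a region Delta of needle positions
    P = np.kron(np.eye(2), np.diag(mask.astype(float)))
    return np.trace(rho @ P).real

print(q_prob(positions > 0), q_prob(positions <= 0))   # weights of the two regions; they sum to 1
```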

Consider two state vectors ##\psi_1## and ##\psi_2##, and suppose they are orthogonal, and pick some initial density matrix ##\rho^E## for the environment. Suppose that:
(i) starting in the state ##\psi_1\psi_1^*\otimes\rho^E## and running time forward a while yields a q-probability distribution for ##X^E## with a single spike around some ##x\gg 0##, and
(ii) similarly with ##\psi_2## around ##-x##.

The question then is: what does the q-probability distribution of ##X^E## look like if we instead start with ##\frac12(\psi_1+\psi_2)(\psi_1+\psi_2)^*\otimes\rho^E##?

The two competing answers I'm worried about are:
(a) It will be bimodal, with a peak around ##x## and a peak around ##-x##
(b) It will be unimodal, concentrated around ##-x## or around ##x##, with the choice between the two depending in some incredibly complicated way on the exact form of ##\rho^E##. (In this story, maybe there's a choice of ##\rho^E## that will give something like (a), but it would require a ludicrous amount of symmetry and so there's no need to worry about it.)

The reason I'm confused, then, is that I thought that the decoherence story involves (among other things) deducing (a) from (i) and (ii). In particular, I thought it followed from the linearity of time evolution together with the whole business with decaying off-diagonal terms in the density matrix, but I don't understand the literature enough to be confident here.

Am I just wrong about what the decoherence story claims? Is it just that they assume enough symmetry in ##\rho^E## to get (a) to happen, but actually (b) is what happens for the majority of initial environment states? I can see that this would be a sensible thing to do if you think of ##\rho^E## as representing an ensemble of initial environments rather than the "true" initial environment.

There is also the separate claim that, if the environment starts in a pure state (a possibility which TI denies but many other interpretations don't) then the linearity alone should leave the whole universe in a superposition of "what would have happened if ##\rho^S## were ##\psi_1##" and the same with ##\psi_2##, which feels to me like it ought to imply an outcome like (a), and it seems like I could then extract (a) for an arbitrary ##\rho^E## by writing it as a convex combination of pure states. I assume this paragraph also contains a mistake, but I would be interested to know where it is.
 
