Does a measurement setup determine the reality of spin measurement outcomes?

vanhees71 · Sep 9, 2019

Where in the usual applications in statistical physics does the entropy become negative?

A. Neumaier · Sep 9, 2019

vanhees71 said:

Where in the usual applications in statistical physics does the entropy become negative?

In Boltzmann's H-theorem. since there the energy has a continuous spectrum.

vanhees71 · Sep 10, 2019

In the usual definition you start with a finite volume and thus the discrete case. Obviously the entropy is always positive,
$$S=-\mathrm{Tr} \hat{\rho} \ln \hat \rho=-\sum_{i} \rho_i \ln \rho_i,$$
where ##\rho_i## are the eigenvalues of ##\hat{\rho}##. Since ##\hat{\rho}## is positive semidefinite and ##\mathrm{Tr} \hat{\rho}=1## you have ##\rho_i \in [0,1]##. For ##\rho_i=0## by definition in the entropy formula you have to set ##\rho_i \ln \rho_i=0##. Thus ##S \geq 0##. Taking the thermodynamic limit keeps ##S \geq 0##.

At which point in the derivation of the Boltzmann equation becomes ##S<0## then?

A. Neumaier · Sep 10, 2019

vanhees71 said:

In the usual definition you start with a finite volume and thus the discrete case. Obviously the entropy is always positive,
$$S=-\mathrm{Tr} \hat{\rho} \ln \hat \rho=-\sum_{i} \rho_i \ln \rho_i,$$
where ##\rho_i## are the eigenvalues of ##\hat{\rho}##. Since ##\hat{\rho}## is positive semidefinite and ##\mathrm{Tr} \hat{\rho}=1## you have ##\rho_i \in [0,1]##. For ##\rho_i=0## by definition in the entropy formula you have to set ##\rho_i \ln \rho_i=0##. Thus ##S \geq 0##. Taking the thermodynamic limit keeps ##S \geq 0##.

At which point in the derivation of the Boltzmann equation becomes ##S<0## then?

In the above, you didn't discuss Boltzmann entropy but von Neumann entropy. It is definable only for trace class operators ##\rho##, which necessarily have discrete spectrum. Thus they produce discrete probability densities, which of cource may be interpreted in terms of information theory. (Though it is artiicial as one cannot implement on the quantum level the decision procedure that gives rise to the notion of Shannon entropy. )

On the other hand, Boltzmann entropy is an integral over classical phase space. He didn't yet know quantum physics when he introduced the H-theorem. Boltzmann entropy can be negative and has no interpretation in terms of information theory.

vanhees71 · Sep 10, 2019

Well, supposedly that's then the deeper reason for the necessity of starting with some "regularization" such that the statistical operator has a discrete spectrum, and the thermodynamic limit is not that trivial.

That there's trouble with entropy in classical statistics is well-known since Gibbs ;-)).

A. Neumaier · Sep 10, 2019

vanhees71 said:

Well, supposedly that's then the deeper reason for the necessity of starting with some "regularization" such that the statistical operator has a discrete spectrum, and the thermodynamic limit is not that trivial.

This is not a regularization. Real materials to which statistical mechanics applies have bounded volume.

The thermodynamic limit is an idealization in which all uncertainties vanish, and hence all statistical connotations disappear.

A. Neumaier · Sep 10, 2019

A. Neumaier said:

At least Peres is more careful and consistent than you.

You are using Born's rule claiming, in (2.1.3) in your lecture notes, that measured are exact eigenvalues - although these are never measured exactly -, to derive on p.21 the standard formula for the q-expectation (what you there call the mean value) of known observables (e.g., the mean energy ##\langle H\rangle## in equilibrium statistical mechanics) with unknown (most likely irrational) spectra. But you claim that the resulting q-expectation is not a theoretical construct but is ''in agreement with the fundamental definition of the expectation value
of a stochastic variable in dependence of the given probabilities for the outcome of a measurement of this variable.'' This would hold only if your outcomes match the eigenvalues exactly - ''accurately'' is not enough.

vanhees71 said:

We have discussed this a zillion of times. This is the standard treatment in introductory text, and rightfully so, because you have to first define the idealized case of precise measurements. Then you can generalize it to more realistic descriptions of imprecise measurements.

But an idealized case can be no more than a didactical prop. Good foundations must be general enough to support all uses.

Your position on the foundations of quantum mechanics is like calling systems of harmonic oscillators the foundations of classical mechanics, because you have to first define the idealized case of precise oscillations. But the true foundations of classical mchnaics are the Lagrangian and Hamiltonian formalisms!

Similarly, in quantum mechnaics, true foundations must feature open systems and POVMs, since these describe the realistic scenarios.

vanhees71 · Sep 10, 2019

We want to do physics. In the way, how POVMs are introduced by Peres, you'd never have discovered QT as a tool to describe what's observed. He also defines a quantum test as a projection operation to a one-dimensional subspace in Hilbert space. As he rightfully stresses, there are no Hilbert spaces nor hermitean operators in the lab but real-world equipment. E.g., where in the quantum-optics literature have you ever needed POVMs rather than the standard formulation of QT to understand all the stringent Bell tests? There you meausure photon detection rates of various kinds and analyze them in terms of an appropriate initial state of your photons and the n-photon correlation functions, sometimes also corrected for non-ideal detectors.

This is also the main obstacle for a physicist to understand what you want to say in your "thermal interpretation". A physical theory needs more than the statement of some axioms. You need an operational meaning, i.e., how to apply the formalism to the operations with real-world equipment in the lab. That's what "interpretation" is all about. There's no necessity for philosophical confusions, and overly abstract axiomatic foundations without relation to real-world experiments may be nice mathematical edifices of pure thought but have not much to do with theoretical physics applicable to phenomenology. A famous example is string theory and its relatives ;-)).

A. Neumaier · Sep 10, 2019

vanhees71 said:

In the way, how POVMs are introduced by Peres, you'd never have discovered QT as a tool to describe what's observed.

Well, it was neither discovered through how Born's rule is introducd by you, but through trying to theoretically understand black body radiation, the photoeffect, and spectra. Good foundations should not follow the way of discovery, which is often erratic and tentative, but should procvide the concepts needed to be able to handle the general situation by sraightforward specialization.

vanhees71 said:

where in the quantum-optics literature have you ever needed POVMs rather than the standard formulation of QT to understand all the stringent Bell tests?

They are needed to characterize the equipment in such a way that one can talk reliably about efficiencies nd close various loopholes. Most of quantum optics works with POVMs rather than von Neumann measurements; these figure only in the simplified accounts.

vanhees71 said:

You need an operational meaning, i.e., how to apply the formalism to the operations with real-world equipment in the lab. That's what "interpretation" is all about.

Yes, and that's why one needs POVMs rather than Bon's rule. Only in introductory courses is the latter sufficient.

vanhees71 · Sep 10, 2019

Why then are POVMs so rarely used in practice? I've not seen them used in quantum optics papers dealing with the foundations. Can you point me to one, where they are needed to understand an experiment?

A. Neumaier · Sep 10, 2019

vanhees71 said:

Why then are POVMs so rarely used in practice? I've not seen them used in quantum optics papers dealing with the foundations. Can you point me to one, where they are needed to understand an experiment?

They are used a lot for different purposes. For example, any quantum phase measurement is necessarily a POVM,
https://iopscience.iop.org/article/10.1088/0954-8998/3/1/002/metaso is any joint measurement of position and momentum (or the quadratures in quantum optics)To see the principles at work, one can works with the simple, idealized version. This is the stuff discussed in textbooks, popular articles, and theoretical papers where the idealization simplifies things a lot.

But to see the limits of real equipment one needs the POVMs - no real detector is ideal . Then it gets messy, e.g.,
https://arxiv.org/pdf/1204.1893and specifically in the context of Bell inequalities:
https://arxiv.org/pdf/quant-ph/0007058 https://arxiv.org/pdf/1304.7460
https://arxiv.org/pdf/quant-ph/0407181This is why people avoid the details if possible and just use simple efficiency proxys.

Some other papers:
https://arxiv.org/pdf/quant-ph/9809063 https://arxiv.org/pdf/quant-ph/0011042 https://arxiv.org/pdf/0804.3082 https://arxiv.org/pdf/1304.7460 https://journals.aps.org/pra/abstract/10.1103/PhysRevA.82.062115 https://arxiv.org/pdf/1007.3043 https://arxiv.org/pdf/quant-ph/0608128 https://arxiv.org/pdf/1111.5874 https://arxiv.org/pdf/1206.6054
start looking, and you find a nearly endless collection of work...

vanhees71 · Sep 10, 2019

A. Neumaier said:

https://arxiv.org/pdf/1204.1893

Well, this is very clearly using the standard quantum-theoretical formalism to construct (!) the POVM description of the measurement device. That rather confirms my view on the POVM formalism than is an argument against it.

A. Neumaier · Sep 10, 2019

vanhees71 said:

Well, this is very clearly using the standard quantum-theoretical formalism to construct (!) the POVM description of the measurement device.

What do you mean by ''construct the POVM description of the measurement device''?

In general, any device has a POVM description that cannot be postulated to be of Born type but needs to be found out by quantum tomography, using the formula (1) for probabilities. This is not Born's riule but a proper extension of it. It cannot be reduced to Born unless one adds nonphysical stuff (ancillas, that have no physical representation) to the description!

vanhees71 · Sep 11, 2019

Are we talking about the same paper? Eq. (1) IS Born's rule. What else should it be?

A. Neumaier · Sep 11, 2019

vanhees71 said:

Are we talking about the same paper?

Yes, but you didn't read it carefully enough.

vanhees71 said:

Eq. (1) IS Born's rule. What else should it be?

No. In equation (1) on p.2 of https://arxiv.org/pdf/1204.1893, $\Pi_n$ is an arbitrary positive operator from a POVM. Born's rule in its most general form is only the special case of (1) where all $\Pi$ are orthogonal projectors.
https://arxiv.org/pdf/1204.1893

vanhees71 · Sep 11, 2019

Why is (1) not Born's rule? I thought ##\Pi_n## is still a self-adjoint operator. At least that's the case in the treatment of the POVM formalism in Peres's textbook. The only difference is that the ##\Pi_n## are not orthonormal projectors as in the special case of ideal von Neumann filter measurements.

Also, as far as I understand the paper is about, how to determine the POVM for a given apparatus, and the necessary analysis is through the standard formalism of measurements in quantum optics, using a sufficiently large set of input states (in this case they use coherent states, aka Laser light).

I still don't see, in which sense the POVM formalism is an extension of standard QT. To the contrary it's based on standard QT, applied to open systems in contradistinction to the idealized description of measurements on closed systems.

A. Neumaier · Sep 11, 2019

vanhees71 said:

Why is (1) not Born's rule? I thought ##\Pi_n## is still a self-adjoint operator.

##\Pi_n## is Hermitian and bounded, hence self-adjoint. But in the formula for probabilties in Born's rule only orthogonal projection operators figure;:

vanhees71 said:

The only difference is that the ##\Pi_n## are not orthonormal projectors as in the special case of ideal von Neumann filter measurements.

This is an essential difference. It means that Born's rule is only a very special and often unrealistic (i.e., wrong!) case of the correct rule calculating probabilities for quantum detectors. To write down the correct rule in an introduction to quantum mechnaics would in fact be easier than writing down Born's rule, because one needs no discussion of the spectral theorem. Thus there is no excuse for giving in the foundations a special, highly idealized case in place of the real thing.

vanhees71 said:

Also, as far as I understand the paper is about, how to determine the POVM for a given apparatus, and the necessary analysis is through the standard formalism of measurements in quantum optics

The method is quantum tomography, which is based on POVM's and semidefinite programming only, nothing else. Of course it needs sources with known density operator. Only for these, textbook quantum optics is used.

vanhees71 said:

I still don't see, in which sense the POVM formalism is an extension of standard QT. To the contrary it's based on standard QT, applied to open systems in contradistinction to the idealized description of measurements on closed systems.

On a closed system, one cannot make a measurement at all, not even one satisfying Born's original rule.
Thus your distinction is meaningless.

The POVM formalism applies (quantitatively correctly) in exactly the same circumstances where Born's rule is claimed to apply (qualitatively, quantitatively only in ideal cases): Between the source and the detector, the system under discussion in the paper (and everywhere else) can be as closed as you can make it; it doesn't change at all the POVM properties of the detector. It is only the detection process itself where the system is open for a moment.

vanhees71 · Sep 11, 2019

Fine, I'd still not know, how to teach beginners in QT using the POVM concept. I'd not think that a book like Peres's is adequate for this purpose. It's not clear to me, what he considers concretely to be a "quantum test". Interestingly enough he introduces the POVM using the Born rule in the standard way for a closed system tracing out what he calls the "ancilla". I don't think it's possible to introduce POVMs for physicists without using the standard formulation in the usual terms of observables and states.

A. Neumaier · Sep 11, 2019

vanhees71 said:

The only difference is that the ##\Pi_n## are not orthonormal projectors as in the special case of ideal von Neumann filter measurements.

Another important difference is that a POVM measurement makes no claim about which values are measured.

It just says that one of the detectors making up the detection device responds with a probability given by the trace formula. The value assigned to the ##k##th detection event is pure convention, and can be any number ##a_k## - whatever has been written on the scale the pointer points to, or whatever has been programmed to be written by an automatic digital recording device. This is reality. Nothing about eigenvalues.

The state dependent formula for the expectation of the observable measured that follows from POVM together with the value assignment is ##\langle A\rangle=Tr~\rho A## with the operator ##A=\sum a_k\Pi_k##.
Note that the same operator ##A## in the expectation can be decomposed in many ways into a linear combination of many POVM terms. The spectral decomposition is just the historically first one, but usually not the most realistic one.

This is similar to the classical situation where a detector returns a number measured in inches or measured in cm, depending on how you label the scale. You could also measure length-squared by changing the scale nonlinearly.

A. Neumaier · Sep 11, 2019

vanhees71 said:

he introduces the POVM using the Born rule in the standard way for a closed system tracing out what he calls the "ancilla".

This is not the introduction, but after having already introduced POVMs he shows that the concept is consistent with the traditional setting, but on an (unphysical, just formally constructed) extended Hilbert space.

DarMM · Sep 11, 2019

A. Neumaier said:

This is not the introduction, but after having already introduced POVMs he shows that the concept is consistent with the traditional setting, but on an (unphysical, just formally constructed) extended Hilbert space.

This is the whole "church of the smaller/larger Hilbert space" issue in quantum foundations. Whether POVMs are fundamental or if they're always PVMs with ancillas.

A. Neumaier · Sep 11, 2019

DarMM said:

This is the whole "church of the smaller/larger Hilbert space" issue in quantum foundations. Whether POVMs are fundamental or if they're always PVMs with ancillas.

Well, at least once you go to QFT, there is no natural way to add the ancillas. It is a purely formal trick to reduce POVMs and related measurement issues to the standard (problematic) foundations.

DarMM · Sep 11, 2019

A. Neumaier said:

Well, at least once you go to QFT, there is no natural way to add the ancillas. It is a purely formal trick to reduce POVMs and related measurement issues to the standard (problematic) foundations.

I agree. Just to inform people of the terms should they encounter them. I myself can't my sense of the "always due to an ancilla" view of POVMs.

Elias1960 · Sep 11, 2019

A. Neumaier said:

How can the wave function be not ontic when its dynamics determines the positions at future times?
Something nonexistent cannot affect the existent.

This happens in the objective Bayesian probability interpretation. There exists some reality, and there exists incomplete but nonetheless objective information about it. It defines a probability distribution - the one which maximizes entropy given the particular information.

If time changes and no new information appears, there will be dynamics - equations which derive that probability distribution for later times from that of the initial time.

There is already a well-developed version of thermodynamics based completely on this interpretation of probability. In this version, the entropy is not something really existing, but a function characterizing our incomplete knowledge of what really exists.

And it has been extended by Caticha to an interpretation of quantum theory too, based on the formulas of Nelsonian stochastics.

Caticha, A. (2011). Entropic Dynamics, Time and Quantum Theory, J Phys A 44:225303, arxiv:1005.2357

It assumes that there are, beyond the configuration q, also some other variables y (which may be simply the configuration of everything else, including the preparation device). Incomplete knowledge of all this is some [itex]\rho(q,y)[/itex] Then one restricts this incomplete knowledge to knowledge about the system itself, integrating over this probability:
## \rho(q) = \int_{y\in Y} \rho(q,y) d y ##
## S(q) = \int_{y\in Y} -\ln \rho(q,y) \rho(q,y) d y ##
and then uses those two functions to define the wave function.

A. Neumaier · Sep 11, 2019

Elias1960 said:

there exists incomplete but nonetheless objective information about it. It defines a probability distribution - the one which maximizes entropy given the particular information.

This is a very questionable statement.

What is the probability distribution of something of which the information is given that 5 times someone observed 1, twice 3 was observed, and once 6 was observed? One cannot maximize the entroy given this information. But all information one can gather about aspects of the universe is information of this kind.

Elias1960 · Sep 11, 2019

A. Neumaier said:

This is a very questionable statement.

What is the probability distribution of something of which the information is given that 5 times someone observed 1, twice 3 was observed, and once 6 was observed? One cannot maximize the entropy given this information. But all information one can gather about aspects of the universe is information of this kind.

One can. One starts with the state of zero additional information. This maximizes entropy for 1/6 for each number of the dice. Then one adds the information given and uses the formula for "Bayesian updating" to compute the probability distribution which takes into account that new information.

PeterDonis · Sep 11, 2019

Elias1960 said:

One starts with the state of zero additional information. This maximizes entropy for 1/6 for each number of the dice. Then one adds the information given and uses the formula for "Bayesian updating" to compute the probability distribution which takes into account that new information.

What are you updating? The probabilities for the 6 possible rolls of the die?

Elias1960 · Sep 11, 2019

PeterDonis said:

What are you updating? The probabilities for the 6 possible rolls of the die?

Yes. See https://en.wikipedia.org/wiki/Bayesian_inference for the formulas.

PeterDonis · Sep 11, 2019

Elias1960 said:

Yes. See https://en.wikipedia.org/wiki/Bayesian_inference for the formulas.

Those aren't formulas for updating the probabilities of the 6 rolls of the die. They're formulas for updating the probabilities of hypotheses. What hypotheses are you updating the probabilities of?

romsofia · Sep 11, 2019

A passage about measurement from Bryce DeWitt that I enjoy:
"Much of the earlier work on the measurement problem, influenced no doubt by Bohr's shadow, emphasized the need for permanent information storage and hence, for complexity, metastability, ergodicity, etc. This emphasis was misplaced. One does not gain understanding by making a problem more complicated. A measurement is simply the establishment of a correlation between a "system" observable an an "apparatus" observable. It is the function of the apparatus to "observe" the system, not vice versa, and hence there is a fundamental asymmetry between them. It turns out that there are two prominent features that characterize a good apparatus: Its "pointer" must be in a localized quantum state, and it must be massive compared to the system. That is all". (Taken from his essay "Decoherence without Complexity and without an Arrow of Time" in the book"Physical origins of time asymmetry").

So to OP, yes, the measurement is real. That's the whole point of the experiment! I think an important point, especially when considering a simple experiment like the SG is: "One does not gain understanding by making a problem more complicated." If you turn the experiment upside down, to the left, to the right isn't the point of the experiment. It's that orientation of the electrons of the silver atoms is probabilistic, and is CORRECTLY predicted by the math of QM!

A. Neumaier · Sep 11, 2019

A. Neumaier said:

What is the probability distribution of something of which the information is given that 5 times someone observed 1, twice 3 was observed, and once 6 was observed? One cannot maximize the entroy given this information. But all information one can gather about aspects of the universe is information of this knd.

Elias1960 said:

One can. One starts with the state of zero additional information. This maximizes entropy for 1/6 for each number of the dice. Then one adds the information given and uses the formula for "Bayesian updating" to compute the probability distribution which takes into account that new information.

Did you ever apply what you recommend to others?

Elias1960 said:

Yes. See https://en.wikipedia.org/wiki/Bayesian_inference for the formulas.

Please tell us the updated probability distribution after having recorded the information described above.

A. Neumaier · Sep 11, 2019

romsofia said:

A measurement is simply the establishment of a correlation between a "system" observable an an "apparatus" observable.

?

Which correlation is established in the Stern-Gerlach experiment between a "system" observable an "apparatus" observable when you take a single measurement of a spin?

romsofia · Sep 11, 2019

A. Neumaier said:

?

Which correlation is established between a "system" observable an an "apparatus" observable when you take a single measurement of a photon?

OP is talking about the SG experiment, not measuring photons.

A. Neumaier · Sep 11, 2019

romsofia said:

OP is talking about the SG experiment, not measuring photons.

Yes, corrected.

romsofia · Sep 11, 2019

A. Neumaier said:

Yes, corrected.

I will outline the steps taken by Bryce DeWitt in his book (as it is an old book, and I don't think many members will have it off hand!): "Dynamical theory of groups and fields" (starting on page 16). So note, this is not my argument, but I believe his argument should be presented.

He argues that the mathematical form for analyzing a single observable "D" for a coupling between a system and apparatus is the total action functional: ##S+S_A+gxD## Where S is the action for the system, ##S_A## is the action for the apparatus, and gxD where: g is the (adjustable) coupling constant, x is some convenient apparatus variable, and D is the observable.

The observable in this case is the spin which we will refer to as the "system", and the "apparatus" will be referred to as: the atom (ignoring spin here), the magnetic field, and a coordinate framework.

The atom is massive compared to the spin, so the dynamical motion of the system S can be considered constant. The apparatus functional will take the form: ##S_A = \int \frac{1}{2} m(\dot{x_2}^2 +\dot{x_3}^2)dt##
Here ##(x_2, x_3)## are the apparatus coordinates in the plane, and we save the ##x_3## axis for the direction of the magnetic field. m is the mass of the atom. He makes the assumption that the atom will move in this plane, so he will ignore ##x_1##.

He then argues that the coupling term that correlates spin and atomic motion has the form (##\hbar = 1)##:
##\int \mu D H dt ##

If left undistrubed, the atom (essentially the apparatus) will follow the trajectory ##x_2 = vt, x_3=0## which is a stationary trajectory for the action ##S_A## He then argues, once again, if the atom is massive, it won't change much from this trajectory.

He then makes some assumptions about the strength of the magnetic field, and approximates it to:
##H = \theta(x_2) \theta(L-x_2) x_3(\frac{\partial H}{\partial x_3})|_{x_3=0}## Where L is the length of pole pieces of the magnet, and the minimum time experiment is to be: ##0 < t < \frac{L}{v}##

and ##\theta## is some step function defined by: ##\theta(e) = \frac{1}{2}(1+\frac{e}{|e|})## which are defined by: 1 for e>0, ##\frac{1}{2}## for e = 0, and 0 for e<0.

With these approximations in mind, he argues that the coupling term will reduce to the form gxD where:
##x = \int \theta(x_2) \theta(L-x_2) x_3 dt##, ##g = \mu (\frac{\partial H}{\partial x_3})|_{x_3=0}##

Later on in the book, he talks about elementary vs complete measurements, and adds more rigor to this arguments in the SG experiment, but I would rather just refer to the book at that point as it is several pages to build it up properly.

Does a measurement setup determine the reality of spin measurement outcomes?

Similar threads

Hot Threads

Recent Insights