I want to discuss the theorem proved in the article "Quantum states and generalized observables: a simple proof of Gleason's theorem" by P. Busch. http://arxiv.org/abs/quant-ph/9909073v3. I've been avoiding this article for some time because I thought it would require more knowledge of POVMs. I recently started reading about them, but I can't say that I understand them yet. It turned out that you don't need a lot of knowledge about POVMs.
I have written down my thoughts about the article below, but I'll start with my questions, so that you don't have to read the whole post just to find out what I want to ask.
- Is it correct to say that the article definitely doesn't contain a simple proof of Gleason's theorem?
- Is it correct to say that what this theorem does is to find all (generalized) probability measures on the partially ordered set ##\mathcal E(\mathcal H)##?
- Is there really a bijective correspondence between probability measures on ##\mathcal E(\mathcal H)## and probability measures on the lattice of projectors? (This would be the consequence if this theorem and Gleason's both establish a bijective correspondence with state operators).
- What is the definition of ##\mathcal E(\mathcal H)##? Is it the set of all bounded positive operators with a spectrum that's a subset of [0,1]?
- Why is ##\mathcal E(\mathcal H)## interesting? (As I said, I don't really understand this POVM stuff yet). To be more specific, why should we think of probability measures on ##\mathcal E(\mathcal H)## as "states". (OK, if they correspond bijectively to probability measures on the lattice of projectors, then that's a reason, but is there another one?)
- Suppose that ##\Omega=\{\omega_1,\dots,\omega_n\}## is the set of possible results of a measurement. Let's use the notation ##p(\omega_i|\rho)## for the probability of result ##\omega_i##, given state ##\rho##. The book (mentioned in my comments below) says that there are positive operators ##E_i## such that ##p(\omega_i|\rho)=\operatorname{Tr}(\rho E_i)##. How do you prove this? (This could perhaps help me understand the significance of these "effects").
- What does it mean for a linear functional to be "normal", and how do you prove that every positive, normalized, normal linear functional on the bounded operators is of the form ##A\mapsto\operatorname{Tr}(\rho A)##, where ##\rho## is a state operator?
- How do you prove that the extremal elements of ##\mathcal E(\mathcal H)## are projection operators? (This is unrelated to the theorem, and perhaps a topic for another thread).
The proof is easy, but it's difficult to understand both the assumptions that go into it and (especially) the author's conclusions.
The title appears to be seriously misleading. This isn't Gleason's theorem at all. Gleason's theorem is about finding all the probability measures on the lattice of closed subspaces of a Hilbert space, or equivalently, about finding all the probability measures on the lattice of projection operators on a Hilbert space. This theorem is about a larger partially ordered set that contains that lattice.
He calls that partially ordered set "the full set of effects ##\mathcal E(\mathcal H)##", but he doesn't define it in the article. There's also no clearly stated definition in the book he wrote ("Operational quantum physics") with two other guys (Grabowski and Lahti). The book starts by considering an experiment with a finite set of possible results ##\Omega=\{\omega_1,\dots,\omega_n\}##. (This is on pages 5-6). It denotes the probability of result ##\omega_i##, given state ##T##, by ##p(\omega_i|T)##, and says that the functional ##E_i## defined by ##E_i(T)=p(\omega_i|T)## is called an effect. Then it claims, without proof, that there's a sequence ##\langle E_i\rangle_i## of positive linear operators, such that ##\sum_i E_i=I## and ##E_i(T)=\operatorname{Tr}(TE_i)## for all i, and all states T. From this point on, the term "effect" refers to the operator ##E_i## that appears on the right, not the functional ##E_i## that appears on the left. This is certainly not an unambiguous definition of the term "effect".
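As a concrete illustration of what the book's claim looks like in the simplest case, here is a minimal Python sketch for a two-dimensional Hilbert space. The state `T` and the three matrices in `effects` are made-up examples, not taken from the book or the article. The sketch only checks that positive operators summing to ##I## give numbers ##\operatorname{Tr}(TE_i)## that behave like probabilities. It says nothing about why every ##p(\omega_i|T)## must be of this form.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def is_positive(a):
    """A real symmetric 2x2 matrix is positive iff its trace and determinant are >= 0."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return a[0][1] == a[1][0] and trace(a) >= 0 and det >= 0

# Hypothetical state T (positive, trace 1) and three hypothetical effects.
T = [[F(3, 4), F(1, 4)], [F(1, 4), F(1, 4)]]
effects = [
    [[F(1, 2), F(1, 4)], [F(1, 4), F(1, 4)]],
    [[F(1, 4), F(-1, 4)], [F(-1, 4), F(1, 2)]],
    [[F(1, 4), F(0)], [F(0), F(1, 4)]],
]

# The effects are positive and sum to the identity.
effect_sum = [[sum(E[i][j] for E in effects) for j in range(2)] for i in range(2)]

# p(omega_i | T) = Tr(T E_i)
probs = [trace(matmul(T, E)) for E in effects]
print(probs, sum(probs))
```

Note that none of the three matrices is a projection (none satisfies ##E^2=E##), so the probabilities here don't come from an ordinary projection-valued measurement on this two-dimensional space.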
Page 25 (of the book) comes closer to actually defining the term. It says that for each state T, the map ##B\mapsto\operatorname{Tr}(TB)## is a functional on the set of bounded linear operators, and that the requirement that the numbers ##\operatorname{Tr}(TB)## represent probabilities implies that B is positive and such that ##B\leq I## (meaning that ##I-B## is positive). The book claims that this conclusion is equivalent to this: The spectrum of any effect is a subset of [0,1]. (The book doesn't actually say that B is an effect, but I'm guessing that this is what the authors meant).
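The equivalence is at least easy to sketch if B is self-adjoint (which positivity implies on a complex Hilbert space). The sketch below relies on the standard fact that the spectrum of a bounded self-adjoint operator lies between the bounds of its quadratic form and contains both bounds. The symbols ##m## and ##M## are my notation, not the book's.

$$
m=\inf_{\|x\|=1}\langle x,Bx\rangle,\qquad M=\sup_{\|x\|=1}\langle x,Bx\rangle,\qquad \sigma(B)\subset[m,M],\qquad m,M\in\sigma(B)
$$

$$
0\leq B\leq I\ \Longleftrightarrow\ 0\leq\langle x,Bx\rangle\leq 1\ \text{for all unit vectors } x\ \Longleftrightarrow\ 0\leq m\ \text{and}\ M\leq 1\ \Longleftrightarrow\ \sigma(B)\subset[0,1]
$$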
On the same page, the notation ##\mathcal E(\mathcal H)## is used for "the set of effects". They mention that it's a partially ordered set with a minimum element and a maximum element, but not a lattice. They also say that the set ##\mathcal E(\mathcal H)## is a convex subset of the set of bounded linear operators, and that its extremal elements are the projection operators.
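The extremality claim can at least be illustrated numerically. The Python sketch below takes one made-up effect `B` that isn't a projection and writes it as the midpoint of two different effects, ##B^2## and ##2B-B^2##, so this particular `B` can't be extremal. The matrices are hypothetical examples, and this is an illustration for one matrix, not a proof of the book's statement.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def is_positive(a):
    """A real symmetric 2x2 matrix is positive iff its trace and determinant are >= 0."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return a[0][1] == a[1][0] and a[0][0] + a[1][1] >= 0 and det >= 0

def is_effect(a):
    """An effect is a positive matrix a such that I - a is positive too."""
    complement = [[(1 if i == j else 0) - a[i][j] for j in range(2)]
                  for i in range(2)]
    return is_positive(a) and is_positive(complement)

# Hypothetical effect that is not a projection (B*B != B).
B = [[F(1, 2), F(1, 4)], [F(1, 4), F(1, 4)]]
B2 = matmul(B, B)
C = [[2 * B[i][j] - B2[i][j] for j in range(2)] for i in range(2)]

# B is the midpoint of the two effects B2 and C, and B2 != C.
midpoint = [[(B2[i][j] + C[i][j]) / 2 for j in range(2)] for i in range(2)]
print(is_effect(B2), is_effect(C), midpoint == B, B2 != C)

# For a projection P the same construction is trivial, since P*P = P = 2P - P*P.
P = [[F(1), F(0)], [F(0), F(0)]]
print(matmul(P, P) == P)
```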
So it appears that an effect is defined as a positive operator B such that ##B\leq I##, or equivalently as a bounded self-adjoint operator with a spectrum that's a subset of [0,1]. (Self-adjointness has to be part of the second version: a non-zero nilpotent matrix has spectrum {0} but isn't positive.) (Is it too much to ask that they actually say that somewhere? It's pretty frustrating to read texts like this). The proof in the article also mentions that there are positive operators that aren't in ##\mathcal E(\mathcal H)##.
The proof considers an arbitrary function ##\nu:\mathcal E(\mathcal H)\to[0,1]## that satisfies a number of conditions that are similar to the defining conditions of a probability measure on a lattice. I haven't verified it, but I suspect that if we had been dealing with the lattice of subspaces, then Busch's conditions would have been equivalent to those defining conditions. If I'm right, I think this explains the assumptions of the theorem.
The proof finds (easily) that the arbitrary function ##\nu## can be uniquely extended, first to the set of all positive operators (which is a cone rather than a vector space), and from there to a linear functional on the vector space of bounded self-adjoint operators. The proof says that this functional is "normal (due to σ-additivity)", and then claims that it's "well known" that any such functional is obtained from a density operator. (I guess Busch means that there's a density operator ##\rho## such that ##\nu(B)=\operatorname{Tr}(\rho B)## for all positive operators B). The article claims that this is proved in (lemma 1.6.1 of) "Quantum theory of open systems" by E.B. Davies, which I would have to go to a library to find, and also in von Neumann's book from 1932, which supposedly contains "a direct elementary proof". But it doesn't say where in the book. I spent 10-15 minutes looking for it, with no success.
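The first extension step can be sketched in Python. This is my reading of it, not a quote from the article: a positive operator ##A## that isn't an effect is scaled down to an effect ##A/n##, and ##\nu(A)## is defined as ##n\,\nu(A/n)##. The sketch assumes that ##\nu## already has the form ##\operatorname{Tr}(\rho E)## on effects, with a made-up density matrix `RHO`, so it only illustrates that the definition doesn't depend on the choice of ##n##. It doesn't derive that from additivity the way the proof has to.

```python
from fractions import Fraction as F

# Hypothetical density matrix (positive, trace 1).
RHO = [[F(3, 4), F(1, 4)], [F(1, 4), F(1, 4)]]

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def is_positive(a):
    """A real symmetric 2x2 matrix is positive iff its trace and determinant are >= 0."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return a[0][1] == a[1][0] and trace(a) >= 0 and det >= 0

def is_effect(a):
    """An effect is a positive matrix a such that I - a is positive too."""
    complement = [[(1 if i == j else 0) - a[i][j] for j in range(2)]
                  for i in range(2)]
    return is_positive(a) and is_positive(complement)

def nu(e):
    """The measure on effects (here Tr(RHO e)); deliberately undefined elsewhere."""
    if not is_effect(e):
        raise ValueError("nu is only defined on effects")
    return trace(matmul(RHO, e))

def nu_extended(a, n):
    """Extension to a positive matrix a: n * nu(a / n), for any n with a / n <= I."""
    return n * nu([[x / n for x in row] for row in a])

# Positive matrix with an eigenvalue larger than 1, so it is not an effect.
A = [[F(3), F(1)], [F(1), F(2)]]
print(is_effect(A), nu_extended(A, 4), nu_extended(A, 5))
```

Both choices of `n` give the same value, which is what makes the extension well defined in this example.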
The article then continues "The conclusion of our theorem is the same as that of Gleason's theorem". There's no explanation of what this means. I guess that it means that just like Gleason, he has found a bijection between the set of state operators and a set of generalized probability measures on a partially ordered set. If that's the case, then there's also a bijective correspondence between probability measures on the lattice of projectors and probability measures on the partially ordered set of effects.