Does Busch's Theorem Offer a Simplified Proof of Gleason's Theorem?

Fredrik · Jun 15, 2014

I want to discuss the theorem proved in the article "Quantum states and generalized observables: a simple proof of Gleason's theorem" by P. Busch. http://arxiv.org/abs/quant-ph/9909073v3. I've been avoiding this article for some time because I thought it would require more knowledge of POVMs. I recently started reading about them, but I can't say that I understand them yet. It turned out that you don't need a lot of knowledge about POVMs.

I have written down my thoughts about the article below, but I'll start with my questions, so that you don't have to read the whole post just to find out what I want to ask.

Is it correct to say that the article definitely doesn't contain a simple proof of Gleason's theorem?
Is it correct to say that what this theorem does is to find all (generalized) probability measures on the partially ordered set ##\mathcal E(\mathcal H)##?
Is there really a bijective correspondence between probability measures on ##\mathcal E(\mathcal H)## and probability measures on the lattice of projectors? (This would be the consequence if this theorem and Gleason's both establish a bijective correspondence with state operators).
What is the definition of ##\mathcal E(\mathcal H)##? Is it the set of all bounded positive operators with a spectrum that's a subset of [0,1]?
Why is ##\mathcal E(\mathcal H)## interesting? (As I said, I don't really understand this POVM stuff yet). To be more specific, why should we think of probability measures on ##\mathcal E(\mathcal H)## as "states". (OK, if they correspond bijectively to probability measures on the lattice of projectors, then that's a reason, but is there another one?)
Suppose that ##\Omega=\{\omega_1,\dots,\omega_n\}## is the set of possible results of a measurement. Let's use the notation ##p(\omega_i|\rho)## for the probability of result ##\omega_i##, given state ##\rho##. The book (mentioned in my comments below) says that there are positive operators ##E_i## such that ##p(\omega_i|\rho)=\operatorname{Tr}(\rho E_i)##. How do you prove this? (This could perhaps help me understand the significance of these "effects").
What does it mean for a linear functional to be "normal", and how do you prove that every normal linear functional on the vector space of positive bounded operators is of the form ##A\mapsto\operatorname{Tr}(\rho A)##, where ##\rho## is a state operator?
How do you prove that the extremal elements of ##\mathcal E(\mathcal H)## are projection operators? (This is unrelated to the theorem, and perhaps a topic for another thread).

These are the thoughts I wrote down to get things straight in my head, and perhaps make it easier to answer my questions:

The proof is easy, but it's difficult to understand both the assumptions that go into it and (especially) the author's conclusions.

The title appears to be seriously misleading. This isn't Gleason's theorem at all. Gleason's theorem is about finding all the probability measures on the lattice of subspaces of a Hilbert space, or equivalently, about finding all the probability measures on the lattice of projection operators on a Hilbert space. This theorem is about a larger partially ordered set that contains that lattice.

He calls that partially ordered set "the full set of effects ##\mathcal E(\mathcal H)##", but he doesn't define it in the article. There's also no clearly stated definition in the book he wrote ("Operational quantum physics") with two other guys (Grabowski and Lahti). The book starts by considering an experiment with a finite set of possible results ##\Omega=\{\omega_1,\dots,\omega_n\}##. (This is on pages 5-6). It denotes the probability of result ##\omega_i##, given state ##T##, by ##p(\omega_i|T)##, and says that the functional ##E_i## defined by ##E_i(T)=p(\omega_i|T)## is called an effect. Then it claims, without proof, that there's a sequence ##\langle E_i\rangle_i## of positive linear operators, such that ##\sum_i E_i=I## and ##E_i(T)=\operatorname{Tr}(TE_i)## for all i, and all states T. From this point on, the term "effect" refers to the operator ##E_i## that appears on the right, not the functional ##E_i## that appears on the left. This is certainly not an unambiguous definition of the term "effect".

Page 25 (of the book) comes closer to actually defining the term. It says that for each state T, the map ##B\mapsto\operatorname{Tr}(TB)## is a functional on the set of bounded linear operators, and that the requirement that the numbers ##\operatorname{Tr}(TB)## represent probabilities implies that B is positive and such that ##B\leq I## (meaning that ##I-B## is positive). The book claims that this conclusion is equivalent to this: The spectrum of any effect is a subset of [0,1]. (The book doesn't actually say that B is an effect, but I'm guessing that this is what the authors meant).

On the same page, the notation ##\mathcal E(\mathcal H)## is used for "the set of effects". They mention that it's a partially ordered set with a minimum element and a maximum element, but not a lattice. They also say that the set ##\mathcal E(\mathcal H)## is a convex subset of the set of bounded linear operators, and that its extremal elements are the projection operators.

So it appears that an effect is defined as a positive operator B such that ##B\leq I##, or equivalently as a bounded linear operator with a spectrum that's a subset of [0,1]. (Is it too much to ask that they actually say that somewhere? It's pretty frustrating to read texts like this). The proof in the article also mentions that there are positive operators that aren't in ##\mathcal E(\mathcal H)##.
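A small numerical sanity check of this reading of the definition, restricted to 2×2 Hermitian matrices so that the eigenvalues have a closed form. This is only a sketch: the helper names `eigenvalues_2x2_hermitian` and `is_effect` and the three sample matrices are made up for illustration and do not come from the article or the book.

```python
import math

def eigenvalues_2x2_hermitian(m):
    """Eigenvalues of a 2x2 Hermitian matrix [[a, b], [conj(b), d]]."""
    a = m[0][0].real
    d = m[1][1].real
    b = m[0][1]
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    return (mean - radius, mean + radius)

def is_effect(m, tol=1e-12):
    """True if 0 <= m <= I, i.e. both eigenvalues lie in [0, 1]."""
    lo, hi = eigenvalues_2x2_hermitian(m)
    return lo >= -tol and hi <= 1 + tol

# Hypothetical sample operators, chosen only for illustration.
identity = [[1, 0], [0, 1]]                     # a projection, trace 2
half_projector = [[0.25, 0.25], [0.25, 0.25]]   # (1/2)|+><+|, not a projection
twice_projector = [[2, 0], [0, 0]]              # positive, but not <= I

print(is_effect(identity), is_effect(half_projector), is_effect(twice_projector))
```

The third matrix is the kind of positive operator outside ##\mathcal E(\mathcal H)## that the proof in the article refers to.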

The proof considers an arbitrary function ##\nu:\mathcal E(\mathcal H)\to[0,1]## that satisfies a number of conditions that are similar to the defining conditions of a probability measure on a lattice. I haven't verified it, but I suspect that if we had been dealing with the lattice of subspaces, then Busch's conditions would have been equivalent to those defining conditions. If I'm right, I think this explains the assumptions of the theorem.

The proof finds (easily) that the arbitrary function ##\nu## can be uniquely extended to a linear functional on the vector space of all positive operators. The proof says that this functional is "normal (due to σ-additivity)", and then claims that it's "well known" that any such functional is obtained from a density operator. (I guess Busch means that there's a density operator ##\rho## such that ##\nu(B)=\operatorname{Tr}(\rho B)## for all positive operators B). The article claims that this is proved in (lemma 1.6.1 of) "Quantum theory of open systems" by E.B. Davies, which I would have to go to a library to find, and also in von Neumann's book from 1932, which supposedly contains "a direct elementary proof". But it doesn't say where in the book. I spent 10-15 minutes looking for it, with no success.

The article then continues "The conclusion of our theorem is the same as that of Gleason's theorem". There's no explanation of what this means. I guess that it means that just like Gleason, he has found a bijection between the set of state operators and a set of generalized probability measures on a partially ordered set. If that's the case, then there's also a bijective correspondence between probability measures on the lattice of projectors and probability measures on the partially ordered set of effects.

micromass · Jun 15, 2014

Fredrik said:

What is the definition of ##\mathcal E(\mathcal H)##? Is it the set of all bounded positive operators with a spectrum that's a subset of [0,1]?

Yes, this is equivalent.

What does it mean for a linear functional to be "normal", and how do you prove that every normal linear functional on the vector space of positive bounded operators is of the form ##A\mapsto\operatorname{Tr}(\rho A)##, where ##\rho## is a state operator?

I severely dislike the terminology in the article. He seems to be dealing with a linear functional ##F:\mathcal{B}(H)\rightarrow \mathbb{C}##. It is much better to use the C*-algebra formalism here, since ##\mathcal{B}(H)## is a C*-algebra in a canonical way. So, what we are dealing with are linear functionals ##\tau:A\rightarrow \mathbb{C}## (where ##A## is a C*-algebra with unit) that are positive and normalized. This means that ##\tau(a^*a)\geq 0## and ##\tau(1) = 1##. It will not come as a surprise that C*-algebraists call such a functional a state on the C*-algebra. See http://en.wikipedia.org/wiki/State_(functional_analysis) See also the section "properties of states" for the definition of a "normal state", and this is what I guess is meant by normal.

The normal states on ##\mathcal{B}(H)## are exactly those of the form ##\tau(A) = \mathrm{tr}(\rho A)##, where ##\rho## is a positive trace-class operator with trace 1. The proof can be found in Kadison & Ringrose page 462 but doesn't seem very elementary. If you wish, I can try to find an elementary proof for you.

micromass · Jun 15, 2014

Fredrik said:

So it appears that an effect is defined as a positive operator B such that ##B\leq I##, or equivalently as a bounded linear operator with a spectrum that's a subset of [0,1]. (Is it too much to ask that they actually say that somewhere? It's pretty frustrating to read texts like this).

They actually do say it in footnote [1]

Fredrik · Jun 15, 2014

Thanks micromass. I will take a look at the proof in Kadison & Ringrose.

Fredrik · Jun 15, 2014

I've had a first look at the proof in K & R. I think I understand that (e) is our assumption and that (a) is the result we want. The implication (e) → (f) looks simple enough, but (f) → (a) could be a problem. They say that this is the content of theorem 7.1.9. The proof of 7.1.9 immediately refers to theorem 7.1.8. The proof of 7.1.8 refers to at least five different numbered theorems. This could be pretty difficult to sort out. On the other hand, I might want to learn some of these things anyway. I will take a break now and take a look at those theorems later.

micromass · Jun 15, 2014

Fredrik said:

How do you prove that the extremal elements of ##\mathcal E(\mathcal H)## are projection operators? (This is unrelated to the theorem, and perhaps a topic for another thread).

Let us immediately do this in a more general situation since it is easier. So let ##A## be a unital C*-algebra. We define ##\mathcal{E}(A)## as the set of all hermitian elements ##a## of ##A## such that ##0\leq a \leq 1##. Recall that a projection in ##A## is a ##p## such that ##p^* = p = p^2##.

First, assume that ##A## is abelian. Due to the Gelfand-Naimark theorem, it has the form ##\mathcal{C}(X)## for ##X## a compact topological Hausdorff space. In this case ##\mathcal{E}(A)## consists of all the continuous functions ##f:X\rightarrow [0,1]##. I don't think it's very difficult to prove that an extreme point ##f## must be an indicator function. Thus ##f## is a projection.

Now, the general case. Let ##p## be a projection and let ##p=(a+b)/2## with ##a,b\in\mathcal{E}(A)##. Then ##b/2 = p - a/2 \leq p##. This implies that ##b## and ##p## commute (see lemma later). But then ##p##, ##a## and ##b## commute, since ##a = 2p-b##. So we can look at the C*-algebra generated by ##p,~a,~b,~1##. From the abelian case, we see that ##p=a=b##. Thus ##p## is an extreme point.

Conversely, let ##a## be an extreme point. Take ##B## the C*-algebra generated by ##a## and ##1##, this is a unital abelian C*-algebra and ##a## is still an extreme point of ##\mathcal{E}(B)##. Thus ##a## is a projection by the abelian case.

Lemma: If ##p## is a projection and if ##0\leq a\leq p##, then ##pa = ap = a##.
Indeed, from ##0\leq a\leq p## it follows that for each ##c\in A##, we have ##0\leq cac^* \leq cpc^*##. In particular, we have ##0\leq (1-p)a(1-p)\leq (1-p)p(1-p) = 0##. Thus ##0 = (1-p)a(1-p)##. But then by the C*-identity, we have
[tex]\|a^{1/2}(1-p)\|^2 = \|(1-p)a(1-p)\| = 0[/tex]
Thus ##a^{1/2} = a^{1/2}p## and thus ##a = ap##. By taking adjoint, we get ##a^* = p^* a^*## and thus ##a = pa##.
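
The direction "extreme point implies projection" can also be cross-checked with one explicit decomposition. This is only a sketch, not necessarily the argument intended above. For any ##a\in\mathcal{E}(A)##, the functions ##t\mapsto t^2## and ##t\mapsto 2t-t^2## map ##[0,1]## into ##[0,1]##, so by the functional calculus both ##a^2## and ##2a-a^2## lie in ##\mathcal{E}(A)##:

```latex
\begin{align*}
a &= \tfrac12\bigl[a^2 + (2a - a^2)\bigr], \\
0 &\leq a^2 \leq 1, \qquad 0 \leq 2a - a^2 = 1 - (1-a)^2 \leq 1 .
\end{align*}
```

If ##a## is an extreme point, both terms must equal ##a##, so ##a^2 = a##, and since ##a## is hermitian it is a projection.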

naima · Jun 15, 2014

Another paper on the subject
With small differences.

micromass · Jun 15, 2014

Also see this file http://wolfweb.unr.edu/homepage/bruceb/Cycr.pdf page 261. I think this gives a shorter proof than K&R.

bhobba · Jun 15, 2014

I have been mucking around with that theorem for a while now and have come up with my own slightly simplified proof. My comments will be about that proof rather than the one in the article - it's essentially the same though.

Fredrik said:

Is it correct to say that the article definitely doesn't contain a simple proof of Gleason's theorem?

It's a Gleason-like theorem based on the stronger assumption of POVMs rather than resolutions of the identity (ROI), i.e. von Neumann measurements. But in modern times it is recognised that von Neumann measurements are not the most general kind of measurement, so an axiomatic treatment can start with POVMs rather than resolutions of the identity. In fact that's my personal preferred path. One can, by means of a bit of physical insight and Neumark's theorem, derive POVMs from ROIs. I used to view it that way but don't any more, and simply take POVMs as the starting point.

Fredrik said:

Is it correct to say that what this theorem does is to find all (generalized) probability measures on the partially ordered set ##\mathcal E(\mathcal H)##?

It shows that, from the assumption of non-contextuality and the strong principle of superposition, the only probability measure that can be defined on a POVM is via the Born rule. Partial ordering isn't required. Of course the real key assumption is non-contextuality.

I think the rest of the questions can best be answered if I post up my proof and we can pull it to pieces.

It will take me a little while though.

Thanks
Bill

bhobba · Jun 15, 2014

OK guys here is the proof I came up with.

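Just for completeness let's define a POVM. A POVM is a set of positive operators Ei such that ∑ Ei = I, acting on, for the purposes of QM, an assumed complex vector space.

Elements of POVMs are called effects, and it's easy to see that a positive operator E is an effect iff E <= I, i.e. iff I - E is also positive (all eigenvalues of E are <= 1). Trace(E) <= 1 is sufficient for this but not necessary: I itself is an effect.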

First let's start with the foundational axiom the proof uses as its starting point.

An observation/measurement with possible outcomes i = 1, 2, 3 ... is described by a POVM Ei such that the probability of outcome i is determined by Ei, and only by Ei, in particular it does not depend on what POVM it is part of.

Only by Ei means regardless of what POVM the Ei belongs to the probability is the same. This is the assumption of non contextuality and is the well known rock bottom essence of Born's rule via Gleason. The other assumption, not explicitly stated, but used, is the strong law of superposition ie in principle any POVM corresponds to an observation/measurement.

I will let f(Ei) be the probability of Ei. Obviously f(I) = 1 from the law of total probability. Since I + 0 = I, f(0) = 0.

First, additivity of the measure for effects.

Let E1 + E2 = E3 where E1, E2 and E3 are all effects. Then there exists an effect E, namely E = I - E3, such that E1 + E2 + E = E3 + E = I. Both {E1, E2, E} and {E3, E} are POVMs, so f(E1) + f(E2) + f(E) = 1 = f(E3) + f(E). Hence f(E1) + f(E2) = f(E3).

Next, linearity wrt the rationals - it's the usual standard argument from additivity from linear algebra but I will repeat it anyway.

f(E) = f(n E/n) = f(E/n + ... + E/n) = n f(E/n), or 1/n f(E) = f(E/n). f(m E/n) = f(E/n + ... + E/n) = m f(E/n), or m/n f(E) = f(m/n E), if m <= n to ensure we are dealing with effects.

I will now extend the definition to any positive operator E. If E is a positive operator, an integer n and an effect E1 exist such that E = n E1: take n at least as large as the largest eigenvalue of E, so that E1 = E/n is a positive operator with E1 <= I. f(E) is defined as n f(E1). To show this is well defined, suppose nE1 = mE2. Then n/(n+m) E1 = m/(n+m) E2, so f(n/(n+m) E1) = f(m/(n+m) E2), i.e. n/(n+m) f(E1) = m/(n+m) f(E2), so nf(E1) = mf(E2).

From the definition it's easy to see that for any positive operators E1, E2, f(E1 + E2) = f(E1) + f(E2). Then, similarly to effects, one shows that for any rational m/n, f(m/n E) = m/n f(E).

Now we want to show continuity, to show it is true for the reals.

If E1 and E2 are positive operators, define E2 < E1 to mean that a positive operator E exists with E1 = E2 + E. This means f(E2) <= f(E1). Let r1n be an increasing sequence of rationals whose limit is the positive irrational number c. Let r2n be a decreasing sequence of rationals whose limit is also c. If E is any positive operator, r1nE < cE < r2nE. So r1n f(E) <= f(cE) <= r2n f(E). Thus by the pinching theorem f(cE) = cf(E).
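
Written out as a display, the squeeze step looks as follows. This is only a restatement of the argument above, with ##r^{(1)}_n## and ##r^{(2)}_n## standing for the sequences r1n and r2n. It uses rational linearity on positive operators and the fact that f is nonnegative on positive operators, which follows from f(E) = n f(E1) with f(E1) a probability.

```latex
\begin{align*}
r^{(1)}_n E \leq cE \leq r^{(2)}_n E
  &\;\Longrightarrow\; r^{(1)}_n f(E) \leq f(cE) \leq r^{(2)}_n f(E), \\
\lim_{n\to\infty} r^{(1)}_n = \lim_{n\to\infty} r^{(2)}_n = c
  &\;\Longrightarrow\; f(cE) = c\,f(E).
\end{align*}
```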

Extending it to any Hermitian operator H.

H can be broken down to H = E1 - E2 where E1 and E2 are positive operators by, for example, separating the positive and negative eigenvalues of H. Define f(H) = f(E1) - f(E2). To show this is well defined: if E1 - E2 = E3 - E4 then E1 + E4 = E3 + E2, so f(E1) + f(E4) = f(E3) + f(E2), i.e. f(E1) - f(E2) = f(E3) - f(E4). Actually there was no need to show uniqueness because I could have defined E1 and E2 to be the positive operators from separating the eigenvalues, but what the heck - it's not hard to show uniqueness.

It's easy to show linearity wrt the reals under this extended definition.

It's pretty easy to see the pattern here, but just to complete it I will extend the definition to any operator O. O can be uniquely decomposed into O = H1 + i H2 where H1 and H2 are Hermitian. f(O) = f(H1) + i f(H2). Again it's easy to show linearity wrt the reals under this new definition, then extend it to linearity wrt complex numbers.

Now the final bit. The hard bit - namely linearity wrt to any operator - has been done by extending the f defined on effects. The well known Von Neumann argument can be used to derive Born's rule. But for completeness will spell out the detail.

First, it's easy to check <bi|O|bj> = Trace (O |bj><bi|).

O = ∑ <bi|O|bj> |bi><bj| = ∑ Trace (O |bj><bi|) |bi><bj|

Now we use the linearity that the foregoing extensions of f have led to.

f(O) = ∑ Trace (O |bj><bi|) f(|bi><bj|) = Trace (O ∑ f(|bi><bj|)|bj><bi|)

Define P as ∑ f(|bi><bj|)|bj><bi| and we have f(O) = Trace (OP).

P, by definition, is called the state of the quantum system. The following are easily seen. Since f(I) = 1, Trace (P) = 1. Thus P has unit trace. f(|u><u|) is a number >= 0 since |u><u| is an effect (for a unit vector |u>). Thus Trace (|u><u| P) = <u|P|u> >= 0, so P is positive.

Hence a positive operator of unit trace P exists such that the probability of Ei occurring in the POVM E1, E2 ... is Trace (Ei P).
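
The last step can be sanity-checked numerically in dimension 2. The sketch below is a consistency check, not a proof: it takes a linear functional f that is generated by a hypothetical density matrix `rho` (an assumption made only so that f is computable), builds P = ∑ f(|bi><bj|)|bj><bi| from the values of f on the matrix units, and compares P with `rho`. The helper names `matmul`, `trace` and `ket_bra` are made up for this example.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def ket_bra(i, j, n):
    """Matrix unit |b_i><b_j| in the standard basis."""
    return [[1 if (r == i and c == j) else 0 for c in range(n)]
            for r in range(n)]

# Hypothetical state, used only to generate a linear functional f.
rho = [[0.75, 0.25j], [-0.25j, 0.25]]
n = len(rho)

def f(op):
    """Linear functional on operators; here f(O) = Trace(O rho)."""
    return trace(matmul(op, rho))

# P = sum_ij f(|b_i><b_j|) |b_j><b_i|
P = [[0j] * n for _ in range(n)]
for i in range(n):
    for j in range(n):
        # |b_j><b_i| has its single 1 in row j, column i
        P[j][i] += f(ket_bra(i, j, n))

# Compare f(O) with Trace(O P) on an arbitrary test operator.
O = [[1, 2 + 1j], [3, -1j]]
print(abs(f(O) - trace(matmul(O, P))) < 1e-12)
```

In this example P comes out equal to `rho`, which is what the derivation predicts for any f of the form Trace (O rho).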

Whew. Glad that's over with.

Now it's out there we can pull it to pieces and see exactly what's going on.

Thanks
Bill

Fredrik · Jun 15, 2014

bhobba said:

Now it's out there we can pull it to pieces and see exactly what's going on.

I'll be happy to assist, since this is a topic that interests me a lot, but it could take some time, since I'm also looking at the stuff I'm discussing with micromass, in order to fill the gaps in Busch's proof.

That book by Blackadar is really nice. (Link in micro's post). I think I'm going to have to read a big part of it thoroughly. Right now I have to go to bed, but I'll see what I can do tomorrow.

bhobba · Jun 15, 2014

Fredrik said:

I'll be happy to assist, since this is a topic that interests me a lot, but it could take some time, since I'm also looking at the stuff I'm discussing with micromass, in order to fill the gaps in Busch's proof.

Hopefully my proof has no gaps. I have seen a number of proofs and picked the eyes out of them so to speak to get the most elegant one.

It interests me as well because it leads to a very elegant axiomatic treatment of QM. Basically the two axioms used in my favourite QM book, Ballentine, are now just one. Very very elegant.

Thanks
Bill

naima · Jun 16, 2014

bhobba said:

Now it's out there we can pull it to pieces and see exactly what's going on.

Busch writes that this is not enough for d=2.
Could you explain why?

bhobba · Jun 16, 2014

naima said:

Busch writes that this is not enough for d=2.
Could you explain why?

It's not enough for D=2 in Gleason's usual proof based on resolutions of the identity. But no such restriction exists for the proof based on POVM's - which is one of its advantages. You can check the proof yourself and see no such restriction is required.

In fact he states exactly that - from his paper 'The statement of the present theorem also extends to the case of 2-dimensional Hilbert spaces where Gleason’s theorem fails.'
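
To see concretely what goes wrong for projectors in dimension 2, here is a standard kind of counterexample, written as a sketch. The Bloch-vector parametrization and the particular function ##n_z^3## are chosen for illustration and are not taken from Busch's paper.

```latex
\begin{align*}
P_{\mathbf n} &= \tfrac12\,(I + \mathbf n\cdot\boldsymbol\sigma), \qquad
P_{\mathbf n} + P_{-\mathbf n} = I, \qquad |\mathbf n| = 1, \\
\nu(P_{\mathbf n}) &:= \tfrac12\,(1 + n_z^3), \qquad \nu(0) := 0, \quad \nu(I) := 1, \\
\nu(P_{\mathbf n}) + \nu(P_{-\mathbf n}) &= \tfrac12\,(1 + n_z^3) + \tfrac12\,(1 - n_z^3) = 1, \\
\operatorname{Tr}(\rho P_{\mathbf n}) &= \tfrac12\,(1 + \mathbf r\cdot\mathbf n)
\quad\text{for } \rho = \tfrac12\,(I + \mathbf r\cdot\boldsymbol\sigma).
\end{align*}
```

In dimension 2 the only nontrivial resolutions of the identity into orthogonal projectors are the pairs ##\{P_{\mathbf n}, P_{-\mathbf n}\}##, so this ##\nu## is additive on all of them. Matching the Born rule would require ##\mathbf r\cdot\mathbf n = n_z^3## for every unit vector ##\mathbf n##. The choices ##\mathbf n = \mathbf e_x, \mathbf e_y, \mathbf e_z## force ##\mathbf r = \mathbf e_z##, and then ##n_z = 1/2## gives ##1/2 \neq 1/8##. So no density operator reproduces this ##\nu## on projectors.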

Thanks
Bill

Fredrik · Jun 17, 2014

micromass said:

Let us immediately do this in a more general situation since it is easier. So let ##A## be a unital C*-algebra. We define ##\mathcal{E}(A)## as the set of all hermitian elements ##a## of ##A## such that ##0\leq a \leq 1##. Recall that a projection in ##A## is a ##p## such that ##p^* = p = p^2##.

First, assume that ##A## is abelian. Due to the Gelfand-Naimark theorem, it has the form ##\mathcal{C}(X)## for ##X## a compact topological Hausdorff space. In this case ##\mathcal{E}(A)## consists of all the continuous functions ##f:X\rightarrow [0,1]##. I don't think it's very difficult to prove that an extreme point ##f## must be an indicator function. Thus ##f## is a projection.

Now, the general case. Let ##p## be a projection and let ##p=(a+b)/2## with ##a,b\in\mathcal{E}(A)##. Then ##b/2 = p - a/2 \leq p##. This implies that ##b## and ##p## commute (see lemma later). But then ##p##, ##a## and ##b## commute, since ##a = 2p-b##. So we can look at the C*-algebra generated by ##p,~a,~b,~1##. From the abelian case, we see that ##p=a=b##. Thus ##p## is an extreme point.

Conversely, let ##a## be an extreme point. Take ##B## the C*-algebra generated by ##a## and ##1##, this is a unital abelian C*-algebra and ##a## is still an extreme point of ##\mathcal{E}(B)##. Thus ##a## is a projection by the abelian case.

Lemma: If ##p## is a projection and if ##0\leq a\leq p##, then ##pa = ap = a##.
Indeed, from ##0\leq a\leq p## it follows that for each ##c\in A##, we have ##0\leq cac^* \leq cpc^*##. In particular, we have ##0\leq (1-p)a(1-p)\leq (1-p)p(1-p) = 0##. Thus ##0 = (1-p)a(1-p)##. But then by the C*-identity, we have
[tex]\|a^{1/2}(1-p)\|^2 = \|(1-p)a(1-p)\| = 0[/tex]
Thus ##a^{1/2} = a^{1/2}p## and thus ##a = ap##. By taking adjoint, we get ##a^* = p^* a^*## and thus ##a = pa##.

It took me some time to refresh my memory about Gelfand transforms and that kind of stuff, but I think I understand this now, except for a detail that looks simple: ##0\leq a\leq p## implies ##0\leq cac^*\leq cpc^*##. I can prove this easily if I can prove that the product of two positive operators is positive, so I tried to prove that. (I thought incorrectly that you had assumed that ##c\geq 0##). After some time of failing to do that, I did a google search for "product of positive operators". What I found only made me suspect that there's no such theorem.

I tried to find this result in Blackadar, but the theorem I found assumes that the operators commute. So maybe it just isn't true. In that case, I don't see why the implication should hold for all ##c\in A##.

Take your time. I still have a lot of other things to look at, in particular the proof (either Kadison & Ringrose or Blackadar) of the theorem about states, and bhobba's long post.

micromass · Jun 17, 2014

Fredrik said:

It took me some time to refresh my memory about Gelfand transforms and that kind of stuff, but I think I understand this now, except for a detail that looks simple: ##0\leq a\leq p## implies ##0\leq cac^*\leq cpc^*##. I can prove this easily if I can prove that the product of two positive operators is positive, so I tried to prove that. (I thought incorrectly that you had assumed that ##c\geq 0##). After some time of failing to do that, I did a google search for "product of positive operators". What I found only made me suspect that there's no such theorem.

I tried to find this result in Blackadar, but the theorem I found assumes that the operators commute. So maybe it just isn't true. In that case, I don't see why the implication should hold for all ##c\in A##.

Take your time. I still have a lot of other things to look at, in particular the proof (either Kadison & Ringrose or Blackadar) of the theorem about states, and bhobba's long post.

I guess you have defined a positive element ##a## as being self-adjoint and having a nonnegative spectrum? Or maybe you are only talking about operators on a Hilbert space and then you defined an operator ##A## to be positive if it is self-adjoint and ##\langle Ax,x\rangle\geq 0## for each ##x##?

Both are fine definitions, but one can prove the following highly nontrivial theorem:

THEOREM: An element ##a## in a ##C^*##-algebra is positive if and only if there exists a ##d## in the ##C^*##-algebra such that ##a=d^*d##.

The proof of this theorem utilizes the Gelfand transform again. See Murphy's "C*-algebras and operator theory". Theorem 2.3.5 gives the equivalence between the operator version of positive and the C*-algebra version of positive. Theorem 2.2.4 proves the above theorem.

The result I used then is that if ##a,~b## are self-adjoint, if ##c## is arbitrary and if ##a\leq b##, then ##c^*ac \leq c^*bc##. Using the theorem, this is now trivial. Indeed, we know that ##b-a\geq 0## and thus there exists a ##d## such that ##b-a = d^*d##. Multiplying by ##c^*## on the left and by ##c## on the right, we get ##c^*bc - c^*ac = c^*d^*dc = (dc)^*dc\geq 0##. Thus ##c^*ac\leq c^*bc##.
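
A concrete 2×2 example may help to see why the route via products of positive operators fails, while conjugation ##a\mapsto c^*ac## is harmless. The matrices ##A##, ##B## and the vector ##x## below are chosen only for illustration.

```latex
\begin{align*}
A &= \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad
B = \frac12\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}, \qquad
AB = \frac12\begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} \neq (AB)^*, \\
x &= \begin{pmatrix} 1 \\ -2 \end{pmatrix}: \qquad
\langle ABx, x\rangle = -\tfrac12 < 0, \\
\langle c^*ac\,x, x\rangle &= \langle a\,(cx), cx\rangle \geq 0
\quad\text{for every } c \text{ and every positive } a .
\end{align*}
```

Both ##A## and ##B## are projections, hence positive, yet ##AB## is not even self-adjoint. The last line is the Hilbert-space version of the statement that ##c^*ac## is positive whenever ##a## is.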

micromass · Jun 18, 2014

Fredrik said:

Why is ##\mathcal E(\mathcal H)## interesting? (As I said, I don't really understand this POVM stuff yet). To be more specific, why should we think of probability measures on ##\mathcal E(\mathcal H)## as "states". (OK, if they correspond bijectively to probability measures on the lattice of projectors, then that's a reason, but is there another one?)

Now, I don't know much QM, so take this with a grain of salt. But whenever I see concepts like that, I always like to compare it with the commutative situation. In that situation, everything should work classically and we should get actual probability measures in the classical sense.

Indeed, if we work commutatively, then we work in a space ##\mathcal{C}(X)## of continuous functions on some (compact) Hausdorff space.
What are the states on this algebra? They are by definition bounded linear functionals ##\tau:\mathcal{C}(X)\rightarrow \mathbb{C}## such that ##\tau(f)\geq 0## if ##f\geq 0## and ##\tau(1)=1##. It turns out that every probability measure ##\mu## on ##X## (if ##X## is nice enough) determines a state, indeed, we set ##\tau(f) = \int_X fd\mu##. The converse is also true. This is a theorem by Riesz, Markov and Kakutani: http://en.wikipedia.org/wiki/Riesz–Markov–Kakutani_representation_theorem

The space ##\mathcal{P}(\mathcal{H})## corresponds here to usual functions ##f:X\rightarrow \{0,1\}## which are continuous. So the projections are just continuous indicator functions.
But if ##X = [0,1]## (for example), then we only have two projections since ##X## is connected. So the probability measures on the projections don't really show us all the probability measures on ##X##, so don't give us all the states.

The space ##\mathcal{E}(\mathcal{H})## corresponds here to continuous functions ##f:X\rightarrow [0,1]##. Probability measures on such functions should now correspond to states on the entire algebra. I haven't proved it, but it seems reasonable since every real-valued function ##f\in \mathcal{C}(X)## can be decomposed as ##f = f^+ - f^-## and the ##f^+## and ##f^-## can be rescaled to be effects.

Also, the fact that there is no bijection between the probability measures on the projections and the probability measures on the effects in this case, might indicate that the answer to your question (3) is no. But of course, in (3) we are dealing with an entirely different C*-algebra!

micromass · Jun 18, 2014

micromass said:

Also, the fact that there is no bijection between the probability measures on the projections and the probability measures on the effects in this case, might indicate that the answer to your question (3) is no. But of course, in (3) we are dealing with an entirely different C*-algebra!

Because of the close connections to Von Neumann algebras, I realized that it is probably better to look at abelian Von Neumann algebras instead of abelian C*-algebras for the classical situation. The difference between these two can be big since a Von Neumann algebra always has many projections.

So what is an abelian Von Neumann algebra? We can prove it is always the same as ##L^\infty(X,\mu)##, i.e. the essentially bounded measurable functions on some measure space. The states should again have the form ##f\rightarrow \int_X fd\nu## for probability measures ##\nu##, but I can't seem to find a reference for it. I can try to prove it if you want.

The projections are now all measurable indicator functions on ##X##. This is a much better situation since this is the same as the ##\sigma##-algebra on ##X##. Thus the probability measures on this do in fact correspond to the states.

The effects are now the measurable functions ##f:X\rightarrow [0,1]##. The same argument as my previous post should show that we indeed get that a probability measure on this is the same as a state.

micromass · Jun 18, 2014

Fredrik said:

Is there really a bijective correspondence between probability measures on ##\mathcal E(\mathcal H)## and probability measures on the lattice of projectors? (This would be the consequence if this theorem and Gleason's both establish a bijective correspondence with state operators).

My previous post on abelian Von Neumann algebras suggested that this is probably true, since a Von Neumann algebra has a lot of projections. In particular, it can be shown that a Von Neumann algebra is the closure of the linear span of its projections. This is a consequence of the spectral theorem.

In particular, given a probability measure ##\nu## on the lattice of projectors and given an effect ##A##, we can then write ##A## as a limit of linear combinations of projections. So ##\sum\alpha_i P_i \rightarrow A##. I think it should then definitely be possible to define ##\nu(A)## as the limit of ##\sum\alpha_i \nu(P_i)##.

Another possibility is to take the spectral measures for ##A##. These are projections ##E(S)## for every measurable set ##S##. Taking ##\nu(E(S))## then defines some kind of probability measure on the sets ##S##. Since ##A = \int xdE##, it might not be a bad idea to define ##\nu(A) = \int xd\nu(E)##.
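
In finite dimensions the second suggestion can be written out explicitly. The following is a sketch under the assumption that ##A## has the spectral decomposition shown, with eigenvalues ##\lambda_k## and mutually orthogonal spectral projections ##P_k## (symbols introduced here for illustration).

```latex
\begin{align*}
A &= \sum_k \lambda_k P_k, \qquad \lambda_k\in[0,1], \qquad \sum_k P_k = I, \\
\nu(A) &:= \sum_k \lambda_k\,\nu(P_k), \\
\nu(P) = \operatorname{Tr}(\rho P)\ \text{for all projections } P
  &\;\Longrightarrow\; \nu(A) = \sum_k \lambda_k \operatorname{Tr}(\rho P_k) = \operatorname{Tr}(\rho A).
\end{align*}
```

The definition alone does not show that the extended ##\nu## is additive on effects that do not commute. That is where Gleason's theorem (dimension at least 3) would have to enter.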

Fredrik · Jun 18, 2014

micromass said:

I guess you have defined a positive element ##a## as being self-adjoint and having a nonnegative spectrum? Or maybe you are only talking about operators on a Hilbert space and then you defined an operator ##A## to be positive if it is self-adjoint and ##\langle Ax,x\rangle\geq 0## for each ##x##?

I tried to use both of these definitions.

micromass said:

Both are fine definitions, but one can prove the following highly nontrivial theorem:

THEOREM: An element ##a## in a ##C^*##-algebra is positive if and only if there exists a ##d## in the ##C^*##-algebra such that ##a=d^*d##.

The proof of this theorem utilizes the Gelfand transform again. See Murphy's "C*-algebras and operator theory". Theorem 2.3.5 gives the equivalence between the operator version of positive and the C*-algebra version of positive. Theorem 2.2.4 proves the above theorem.

I thought I would be able to prove this without looking at Murphy (because I just read about similar things in Sunder), but I had to assume that d is normal to complete my proof. I see that Murphy's 2.2.4 proves that ##d^*d## is positive without that assumption. The proof is something that I wouldn't have come up with in a very long time. I will have to study it carefully.

micromass said:

The result I used then is that if ##a,~b## are self-adjoint, if ##c## is arbitrary and if ##a\leq b##, then ##c^*ac \leq c^*bc##. Using the theorem, this is now trivial. Indeed, we know that ##b-a\geq 0## and thus there exists a ##d## such that ##b-a = d^*d##. Multiplying by ##c^*## on the left and by ##c## on the right, we get ##c^*bc - c^*ac = c^*d^*dc = (dc)^*dc\geq 0##. Thus ##c^*ac\leq c^*bc##.

Crystal clear. Thanks.

micromass · Jun 18, 2014

Fredrik said:

I thought I would be able to prove this without looking at Murphy (because I just read about similar things in Sunder), but I had to assume that d is normal to complete my proof. I see that Murphy's 2.2.4 proves that ##d^*d## is positive without that assumption. The proof is something that I wouldn't have come up with in a very long time. I will have to study it carefully.

Like I said, it is highly nontrivial. It took the great operator algebraist Kaplansky to show this.

Fredrik · Jun 22, 2014

Sorry about being so slow. I haven't abandoned the thread. I decided that I want to know C*-algebras a little bit better before I continue. I expect to spend between 1 and 3 more days on that before I return here.

naima · Jul 1, 2014

bhobba said:

It's pretty easy to see the pattern here, but just to complete it I will extend the definition to any operator O. O can be uniquely decomposed into O = H1 + i H2, where H1 and H2 are Hermitian. Define f(O) = f(H1) + i f(H2). Again it's easy to show linearity with respect to the reals under this new definition, and then extend it to linearity with respect to the complex numbers.

Take O = i * Id.
f gives the probability of an operator. What would it be for i * Id?

bhobba · Jul 1, 2014

naima said:

Take O = i * Id. f gives the probability of an operator. What would it be for i * Id?

Obviously i.

But remember I am extending f. Beyond effects it's not interpreted as a probability.
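
Here is a minimal sketch of that extension for 2x2 matrices. It assumes, purely for illustration, that f on Hermitian operators is given by ##\operatorname{Tr}(\rho\,\cdot)## for a sample density matrix `rho`; the matrix and the function names are hypothetical and not from the paper.

```python
# Extension f(O) = f(H1) + i f(H2), where O = H1 + i H2 with H1, H2 Hermitian.
# Assumption: on Hermitian operators f is Tr(rho .) for a sample density matrix.

def adjoint(x):
    n = len(x)
    return [[complex(x[j][i]).conjugate() for j in range(n)] for i in range(n)]

def hermitian_parts(o):
    # H1 = (O + O*)/2 and H2 = (O - O*)/(2i); both are Hermitian.
    n = len(o)
    od = adjoint(o)
    h1 = [[(o[i][j] + od[i][j]) / 2 for j in range(n)] for i in range(n)]
    h2 = [[(o[i][j] - od[i][j]) / 2j for j in range(n)] for i in range(n)]
    return h1, h2

def trace_product(a, b):
    n = len(a)
    return sum(a[i][k] * b[k][i] for i in range(n) for k in range(n))

rho = [[0.75, 0.25], [0.25, 0.25]]   # hypothetical density matrix (trace 1, positive)

def f_hermitian(h):
    # Stand-in for f on Hermitian operators; real-valued by construction.
    return complex(trace_product(rho, h)).real

def f_extended(o):
    h1, h2 = hermitian_parts(o)
    return f_hermitian(h1) + 1j * f_hermitian(h2)

i_id = [[1j, 0], [0, 1j]]            # the operator i * Id
value = f_extended(i_id)
print(value)
```

For O = i * Id the Hermitian parts are H1 = 0 and H2 = Id, so the extended value is i times f(Id).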

Thanks
Bill

naima · Jul 2, 2014

Why do you have to extend f for this theorem? You are adding a new axiom.

bhobba · Jul 2, 2014

naima said:

Why do you have to extend f for this theorem? You are adding a new axiom.

You do not add anything by doing that. It's defined for effects. Defining it any way you like for any other operator violates no principle of logic. However, only the way I defined it leads to continuity and linearity.

Thanks
Bill

naima · Jul 18, 2014

The theorem to be proved is:
Any generalized probability measure ##v## on the set ##\mathscr E (H)## of effects with the properties
(P1) ##0 \leq v(E) \leq 1## for all ##E##;
(P2) ##v(I) = 1##;
(P3) ##v(E + F + \dots) = v(E) + v(F) + \dots## for any sequence ##E, F, \dots## with ##E + F + \dots \leq I##
is of the form ##v(E) = \operatorname{tr}[\rho E]## for all ##E##, for some density operator ##\rho##.
The authors of the paper have shown that this enables us to extend ##v## to an ##\mathbb R##-linear map on the set ##\mathscr E^\cup (H) = \mathscr E (H) + \mathscr E^{op} (H)##, where ##\mathscr E^{op} (H)## is the set of the opposites ##-E## of the elements ##E## of ##\mathscr E (H)##.
Say ##H## is ##d##-dimensional.
We can construct ##d^2## elements of ##\mathscr E^\cup (H)## out of an orthonormal basis of ##H##:
##d## projectors: ##|j\rangle\langle j|##
##d(d-1)/2## operators: ##\operatorname{sym}(j,k) = \frac{|j\rangle\langle k| + |k\rangle\langle j|}{\sqrt 2}##
##d(d-1)/2## other operators: ##\operatorname{asym}(j,k) = i\,\frac{|j\rangle\langle k| - |k\rangle\langle j|}{\sqrt 2}## (with ##i^2 = -1##)
##v## maps each of them into ##[-1, 1]##.
We will have to prove that ##\rho = \sum_j v(|j\rangle\langle j|)\,|j\rangle\langle j| + \sum_{j<k} v(\operatorname{sym}(j,k))\,\operatorname{sym}(j,k) + \sum_{j<k} v(\operatorname{asym}(j,k))\,\operatorname{asym}(j,k)## is the density matrix we are looking for.

a) Hermiticity
As a linear combination (with real coefficients) of Hermitian operators, ##\rho## is Hermitian.
b) Trace
sym(j,k) and asym(j,k) are traceless, so Tr(##\rho##) is the trace of the first d terms. As the |j><j| sum to Id, additivity (P3) together with (P2) shows that the trace is 1.
c) Positivity
There is an orthonormal basis B in which this Hermitian matrix is diagonal.
For each vector I of B, v(|I >< I|) is on the diagonal of ##\rho##.
As |I >< I| is a projector, it is an effect, and v maps it into [0, 1].

As it is a nonnegative self-adjoint operator with unit trace, ##\rho## is a density matrix.

An effect E may be seen as a vector in this ##d^2##-dimensional basis with coefficients in [-1, 1], such that
v(E) = Tr(E ##\rho##)

I have avoided using complex coefficients. Complex numbers only appear in the SU(N) generators of the effects.
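
The construction can be tried out numerically. The sketch below is only an illustration under stated assumptions: it takes d = 2, uses a made-up density matrix `rho_true` to play the role of the measure via v(E) = Tr(rho_true E), and then rebuilds the matrix from the ##d^2## real numbers v(B), with each value multiplying its own basis operator B. All names are hypothetical.

```python
import math

def unit(d, j, k):
    # Matrix unit |j><k| as a d x d nested list.
    return [[1.0 if (r, c) == (j, k) else 0.0 for c in range(d)]
            for r in range(d)]

def combine(d, terms):
    # Sum of coeff * matrix over (coeff, matrix) pairs.
    out = [[0j] * d for _ in range(d)]
    for coeff, m in terms:
        for r in range(d):
            for c in range(d):
                out[r][c] += coeff * m[r][c]
    return out

def hermitian_basis(d):
    # d projectors, then sym(j,k) and asym(j,k) for each pair j < k.
    s = 1 / math.sqrt(2)
    basis = [unit(d, j, j) for j in range(d)]
    for j in range(d):
        for k in range(j + 1, d):
            ejk, ekj = unit(d, j, k), unit(d, k, j)
            basis.append(combine(d, [(s, ejk), (s, ekj)]))             # sym(j,k)
            basis.append(combine(d, [(1j * s, ejk), (-1j * s, ekj)]))  # asym(j,k)
    return basis

def trace_product(a, b):
    n = len(a)
    return sum(a[r][c] * b[c][r] for r in range(n) for c in range(n))

# Hypothetical state, used only to generate the numbers v(B).
rho_true = [[0.75, 0.25 - 0.1j], [0.25 + 0.1j, 0.25]]

def v(e):
    # Stand-in for the R-linear extension of the measure: v(E) = Tr(rho_true E).
    return complex(trace_product(rho_true, e)).real

d = 2
basis = hermitian_basis(d)
rho = combine(d, [(v(b), b) for b in basis])   # sum over B of v(B) * B
print(rho)
```

This works because the ##d^2## operators are orthonormal for the inner product Tr(AB) on Hermitian matrices, so any Hermitian ##\rho## equals the sum of Tr(##\rho## B) B over the basis.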

bhobba · Jul 18, 2014

naima said:

The authors of the paper

What paper is that?

Thanks
Bill

naima · Jul 18, 2014

Look at Fredrik's first post:
http://arxiv.org/abs/quant-ph/9909073v3

naima · Jul 21, 2014

micromass said:

I severely dislike the terminology in the article. He seems to be dealing with a linear functional ##F:\mathcal{B}(H)\rightarrow \mathbb{C}##.

v maps effects to [0, 1].
Busch extends the definition to ##v:\mathcal{B}(H)\rightarrow [-1, 1]##.
And that is all! There is no map to ##\mathbb{C}##.

I have nothing against C* algebras (C is not for complex but for closed!)

I gave the end of the proof only for finite dimensional Hilbert spaces.
Can it be extended to general Hilbert spaces?

Another question:
I am accustomed to density operators of pure states but not to effects. Given ##v##, you have a density operator.
What are the probability measures that give pure-state density operators?

Fredrik · Jul 28, 2014

Sorry about abandoning this thread for so long. I took some time to learn a bit more about C*-algebras, and it took me longer than expected. First I had to refresh my memory about the things I had studied before, and then my parents came to visit for several weeks. I didn't get much done during that time.

I understand most of the C*-algebra stuff that I need now, e.g. why ##x^*x\geq 0## for all x, and why the projections in a unital C*-algebra ##\mathcal A## are the extreme points of the set ##\mathcal E(\mathcal A)=\{x\in\mathcal A|0\leq x\leq 1\}=\{x\in\mathcal A|\sigma(x)\subseteq[0,1]\}## of effects in ##\mathcal A##. But I still don't understand the converse of that last statement, i.e. why an arbitrary extreme point of ##\mathcal E(\mathcal A)## must be a projection. I don't even see how to prove it when ##\mathcal A## is commutative. I feel like it should be easy, so maybe I'm just missing something simple.

I still haven't studied the theorem about linear functionals on von Neumann algebras (theorem III.2.1.4 in Blackadar, p. 262, or theorem 7.1.12 in Kadison & Ringrose, p. 462). I may have to read more about the basics of von Neumann algebras to get through it, so this could take a while. I'll get started right away. I will get to the other things discussed in this thread as soon as I can.

micromass · Jul 29, 2014

Fredrik said:

Sorry about abandoning this thread for so long. I took some time to learn a bit more about C*-algebras, and it took me longer than expected. First I had to refresh my memory about the things I had studied before, and then my parents came to visit for several weeks. I didn't get much done during that time.

I understand most of the C*-algebra stuff that I need now, e.g. why ##x^*x\geq 0## for all x, and why the projections in a unital C*-algebra ##\mathcal A## are the extreme points of the set ##\mathcal E(\mathcal A)=\{x\in\mathcal A|0\leq x\leq 1\}=\{x\in\mathcal A|\sigma(x)\subseteq[0,1]\}## of effects in ##\mathcal A##. But I still don't understand the converse of that last statement, i.e. why an arbitrary extreme point of ##\mathcal E(\mathcal A)## must be a projection. I don't even see how to prove it when ##\mathcal A## is commutative. I feel like it should be easy, so maybe I'm just missing something simple.

Assume that ##f## is an extreme point of the effects of the commutative algebra ##\mathcal{C}(X)## (with ##X## a compact topological space). Assume that there is an ##x\in X## such that ##0<f(x)<1##. Then there must be an entire open set ##U\subseteq X## such that ##0<f(x)<1## for each ##x\in U##. By a partition of unity argument, we can find a nonzero continuous function ##g## such that ##g(x) = 0## for ##x\notin U## and such that ##0\leq f(x)\pm g(x)\leq 1## on ##U##. Thus ##f(x)\pm g(x)## is an effect and we can write ##f(x) = \frac{1}{2}(f(x) + g(x)) + \frac{1}{2}(f(x) - g(x))##. Thus ##f## is not an extreme point in this case.

So this proves that if ##f## is an extreme point, then ##f(x) = 0## or ##f(x)=1## for every ##x##, which happens exactly if ##f## is a projection.

I still haven't studied the theorem about linear functionals on von Neumann algebras (theorem III.2.1.4 in Blackadar, p. 262, or theorem 7.1.12 in Kadison & Ringrose, p. 462). I may have to read more about the basics of von Neumann algebras to get through it, so this could take a while. I'll get started right away. I will get to the other things discussed in this thread as soon as I can.

I think an easier proof can be found in Conway's "A course in operator theory", Theorem 46.4.

Fredrik · Jul 29, 2014

micromass said:

Assume that ##f## is an extreme point of the effects of the commutative algebra ##\mathcal{C}(X)## (with ##X## a compact topological space). Assume that there is an ##x\in X## such that ##0<f(x)<1##. Then there must be an entire open set ##U\subseteq X## such that ##0<f(x)<1## for each ##x\in U##. By a partition of unity argument, we can find a nonzero continuous function ##g## such that ##g(x) = 0## for ##x\notin U## and such that ##0\leq f(x)\pm g(x)\leq 1## on ##U##. Thus ##f(x)\pm g(x)## is an effect and we can write ##f(x) = \frac{1}{2}(f(x) + g(x)) + \frac{1}{2}(f(x) - g(x))##. Thus ##f## is not an extreme point in this case.

So this proves that if ##f## is an extreme point, then ##f(x) = 0## or ##f(x)=1## for every ##x##, which happens exactly if ##f## is a projection.

OK, I think I understand. I see how the existence of the open set U follows from continuity. ##U^c## is closed. If the space is normal, ##U^c## and ##\{x\}## are separated by open sets V,W. Let's say that V is the one that contains ##U^c##. Then {U,V} is an open cover. I assume that this is the open cover we use to find a partition of unity. Theorem 36.1 in Munkres says that a partition of unity exists, if the space is normal. I have used that the space is normal twice already, so I hope that ##C(\hat{\mathcal A})## is normal.

The partition of unity is a pair (G,H) of two continuous functions supported in U and V respectively, such that G+H=1 and both of their ranges are a subset of [0,1]. We will ignore H and only use G. Define M=sup G(x). Let t be a positive number such that ##M/t<\min\{1-f(x),f(x)\}##. Define g=G/t. We have
\begin{align}
&0=0+0\leq f(x)+g(x)=f(x)+G(x)/t\leq f(x)+M/t<f(x)+(1-f(x))=1\\
&1=1-0\geq f(x)-g(x) =f(x)-G(x)/t\geq f(x)-M/t > f(x)-f(x)=0.
\end{align}
I also understand how this, in the commutative case, implies that the corresponding elements of ##\mathcal A## are effects. The range of one of these functions is equal to the spectrum of the corresponding element of ##\mathcal A##, so the spectra of the operators corresponding to the functions f+g and f-g are subsets of [0,1], and that makes them effects.
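
To make the construction concrete, here is a toy version on a finite space. This is a sketch under simplifying assumptions: X is a four-point set with the discrete topology (so every function on X is continuous), the values of `f` and `G` are made up, and `G` merely stands in for the partition-of-unity function supported in U.

```python
# Finite model: X = {0, 1, 2, 3} with the discrete topology, so C(X) is just
# the set of lists of 4 real numbers. All sample values are illustrative.
f = [0.0, 0.3, 0.8, 1.0]                       # an effect: values in [0, 1]
X = range(len(f))
U = [x for x in X if 0 < f[x] < 1]             # where f is strictly between 0 and 1

G = [0.0, 1.0, 0.5, 0.0]                       # stand-in bump function, zero outside U
M = max(G)                                     # M = sup G(x)

# Choose t > 0 with M/t < min{f(x), 1 - f(x)} for every x where G is nonzero.
margin = min(min(f[x], 1 - f[x]) for x in U)
t = 2 * M / margin
g = [G[x] / t for x in X]

plus = [f[x] + g[x] for x in X]                # f + g
minus = [f[x] - g[x] for x in X]               # f - g
midpoint = [(plus[x] + minus[x]) / 2 for x in X]

print(plus, minus)
```

Both `plus` and `minus` take values in [0,1] and differ from each other, while their average is f, so this f is not an extreme point of the set of effects.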

This was a lot harder than I expected at first, and it's only the commutative case. I haven't yet thought about whether I can use the commutative case to solve the non-commutative case, similar to how you did it for the converse statement. I have to go to bed, but I'll give it a try tomorrow.

micromass said:

I think an easier proof can be found in Conway's "A course in operator theory", Theorem 46.4.

That does look easier. I will study that version instead. If I need to learn more about the basics of von Neumann algebras, I will try to use the same book for that too.

Fredrik · Jul 30, 2014

Fredrik said:

This was a lot harder than I expected at first, and it's only the commutative case. I haven't yet thought about whether I can use the commutative case to solve the non-commutative case, similar to how you did it for the converse statement. I have to go to bed, but I'll give it a try tomorrow.

This turned out to be very easy. If p is an extreme point of ##\mathcal E(\mathcal A)##, then it's an extreme point of ##\mathcal E(C^*(\{1,p\}))##, and then the result we just proved tells us that p is a projection in ##C^*(\{1,p\})##, which implies that p is a projection in ##\mathcal A##.

I would of course give you a button-thanks for every one of your posts if it hadn't been for the new rule (that I disagree with) that prevents me from doing so.

Error message said:

We're glad you're happy with micromass! However, we wish to discourage giving thanks to the same user multiple times in a row, especially for the same thread. Please thank another worthy member and then you can return to thanking micromass

micromass · Jul 30, 2014

Fredrik said:

I would of course give you a button-thanks for every one of your posts if it hadn't been for the new rule (that I disagree with) that prevents me from doing so.

Haha, I'm already happy that I was able to help. It's not often I get to talk about operator theory
