#176
vanesch
ttn said: I don't really understand this. I just don't know any details about formal Kolmogorov probability theory. In what way are the variables one "conditions on" there (I gather that's technically the wrong word, but I don't know what the right one is) different from regular variables in regular conditional probabilities?
Ok, I think this is crucial to all that follows. Maybe "Kolmogorov" was too much of a mouthful; it is just standard probability theory. Off the top of my head - correct me if I'm wrong - a Kolmogorov probability measure P is a mapping from a subset M of the power set of Omega into the real interval [0,1] such that:
P(Omega) = 1
P(empty set) = 0
P(A U B) = P(A) + P(B) if A and B disjoint
and some other, more subtle, properties making P into a measure,
see http://en.wikipedia.org/wiki/Kolmogorov_axioms
A, an element of M (and thus a subset of Omega), is called an "event", and P(A) is "the probability for the event A to happen".
For a finite set of elements Omega, M can be set equal to the powerset (the set of all subsets) of Omega.
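To make the above concrete, here is a minimal Python sketch of such a finite probability space; the sample space and the weights are a toy example of my own, not anything taken from the discussion above.
```python
from itertools import combinations

Omega = frozenset({"up", "down"})          # a toy sample space
weights = {"up": 0.5, "down": 0.5}         # elementary probabilities, summing to 1

def powerset(s):
    """All subsets of s: here M is taken to be the full power set of Omega."""
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

def P(event):
    """The measure P: add up the weights of the elements of the event."""
    return sum(weights[x] for x in event)

M = powerset(Omega)

assert P(Omega) == 1                       # P(Omega) = 1
assert P(frozenset()) == 0                 # P(empty set) = 0
for A in M:
    for B in M:
        if A.isdisjoint(B):                # P(A U B) = P(A) + P(B) for disjoint A, B
            assert abs(P(A | B) - (P(A) + P(B))) < 1e-12
```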
These axioms define a standard probability distribution. Of course, for a given set Omega, there can be MANY DIFFERENT PROBABILITY DISTRIBUTIONS, and we can label them with a PARAMETER SET a, b or L. But there is a difference between looking at different sets within one probability distribution, and looking at the probability of a set for different values of the parameter set, and that's the entire difference I tried to explain between the usage of "|" (which is WITHIN a single probability distribution) and the usage of ";", which refers to swapping between different probability distributions.
As I said, in all considerations of "causality", "locality", "determinism" and so on, one has to ASSUME FREE CHOICE somehow, and depending on this free choice, we CHANGE THE PROBABILITY DISTRIBUTION. So everything that depends on our free choice goes into parameters that tell us which probability distribution we are going to use. The free choice is the setting of Bob and Alice's analysers: they can decide that freely, and as a function of the choice they make, we have different probability distributions of how things will happen. ALL things that will happen. There is also an extra parameter included, which is the COMMON cause, L, and which can be seen as a free choice of some unknown individual (a little devil, if you want). Together with a and b, it fixes the entire probability distribution.
However, an OUTCOME is not something that FIXES the probability distribution; it is part of what is described by that distribution. So it doesn't enter into any parameter list!
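Just to illustrate this parameter/event distinction, here is a hypothetical sketch in Python: the weights below are completely made up (they are not the quantum or any other physically meaningful distribution); the only point is that (a, b, L) select WHICH distribution is used, while events like "Alice gets +1" are subsets of Omega measured WITHIN that distribution.
```python
import math

Omega = [(+1, +1), (+1, -1), (-1, +1), (-1, -1)]    # joint outcomes (Alice, Bob)

def weights(a, b, L):
    """One probability distribution per choice of the parameters (a, b, L)."""
    # toy weights, invented for illustration; any normalized assignment would do
    p_same = 0.5 * math.sin((a - b + L) / 2) ** 2
    p_diff = 0.5 * math.cos((a - b + L) / 2) ** 2
    return {(+1, +1): p_same, (-1, -1): p_same, (+1, -1): p_diff, (-1, +1): p_diff}

def P(event, a, b, L):
    """P(event ; a, b, L): the measure of an event under the (a, b, L) distribution."""
    w = weights(a, b, L)
    return sum(w[x] for x in Omega if x in event)

A_plus = {x for x in Omega if x[0] == +1}           # the event "Alice gets +1"
B_plus = {x for x in Omega if x[1] == +1}           # the event "Bob gets +1"

print(P(A_plus, 0.0, 1.0, 0.3))                     # a well-defined number for these parameters
print(P(A_plus & B_plus, 0.0, 1.0, 0.3))            # P(A,B ; a,b,L)
```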
ttn said: And how can it be that outcome independence is somehow built into the axioms of probability theory? What does this mean for OQM since that theory violates OI?
Ok, I formulated this badly, sorry. OI is not something that is "built into the axioms of probability theory"; it is rather something that is well-defined within probability theory, but which I don't NEED. What I meant was that P(A) and P(A|B) are two well-defined quantities, meaning we can talk about P(A) without having to say anything about whether "it depends also on B or not".
In fact, P(A|B) is nothing else but P(A sect B) / P(B) ; so it is a derived concept. Saying that P(A|B) = P(A) just comes down to saying that
P(A sect B) = P(A) x P(B). We usually write P(A sect B) as P(A,B).
It makes perfect sense to talk about P(A) and about P(A,B). These are numbers which are well defined if the probability distribution is well defined (meaning, the parameters which select the distribution from its family are fixed, in our case a, b and L).
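A small standalone illustration of that last point (the numbers are toy values I picked, nothing more): once one distribution is fixed, P(A), P(B) and P(A,B) are just numbers, and P(A|B) is the derived ratio P(A sect B) / P(B).
```python
Omega = [(+1, +1), (+1, -1), (-1, +1), (-1, -1)]    # joint outcomes (Alice, Bob)
w = {(+1, +1): 0.4, (+1, -1): 0.1, (-1, +1): 0.1, (-1, -1): 0.4}   # one fixed toy distribution

def P(event):
    """Measure of an event under the single fixed distribution w."""
    return sum(w[x] for x in event)

A = {x for x in Omega if x[0] == +1}       # "Alice gets +1"
B = {x for x in Omega if x[1] == +1}       # "Bob gets +1"

P_A, P_B, P_AB = P(A), P(B), P(A & B)
P_A_given_B = P_AB / P_B                   # the derived concept, defined only when P(B) > 0

print(P_A, P_AB, P_A_given_B)              # 0.5 0.4 0.8
print(abs(P_AB - P_A * P_B) < 1e-12)       # False: in this toy distribution A and B are correlated
```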
ttn said: I don't know now. You'll have to explain the difference between conditionalizing on a variable and regarding it as a parameter or whatever for Kolmogorov.
As I said, the parameters select ONE of different probability distributions out of a family. Once we have our distribution, we can apply it to M. A conditional probability within this distribution is then nothing else but a shorthand for a fraction of two measures of this distribution.
You cannot write P(A ; a) = P(A and a) / P(a) if a is a parameter: a parameter is not an event (not a subset of Omega), so P(a) is not even defined. You can, however, perfectly well write P(A|B) = P(A sect B) / P(B); that is its very definition.
ttn said: But as far as I know, Bell Locality is still the condition that
P(A|a,b,B,L) = P(A|a,L).
Which is to be rewritten as:
P(A|B ; a,b,L) = P(A ; a,L)
By definition, we have: P(A|B ; a,b,L) = P(A,B ; a,b,L) / P(B ; a,b,L).
Now take my "Bell condition", which is:
P(A,B ; a,b,L) = P(A ; a,L) x P(B ; b,L), together with the fact that P(B ; a,b,L) = P(B ; b,L) (it does not depend on the parameter a - that's information locality to me), and fill this into the definition of P(A|B ; a,b,L) above;
we have:
P(A|B ; a,b,L) = P(A ; a,L) x P(B ; b,L) / P(B ; b,L) = P(A ; a,L) and we're home:
both statements are equivalent.
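As a quick numeric sanity check of that equivalence (everything here is a toy family I invented, constructed to factorize by assumption), one can verify that the ratio P(A,B ; a,b,L) / P(B ; a,b,L) indeed comes out equal to P(A ; a,L):
```python
import math

def pA(a, L):                               # Alice's marginal: depends on a and L only
    p = 0.5 * (1 + math.cos(a + L))
    return {+1: p, -1: 1 - p}

def pB(b, L):                               # Bob's marginal: depends on b and L only
    p = 0.5 * (1 + math.cos(b - L))
    return {+1: p, -1: 1 - p}

def P_joint(A_out, B_out, a, b, L):         # P(A,B ; a,b,L) = P(A ; a,L) x P(B ; b,L) by construction
    return pA(a, L)[A_out] * pB(b, L)[B_out]

a, b, L = 0.2, 1.1, 0.7
A_out, B_out = +1, +1

P_B = sum(P_joint(x, B_out, a, b, L) for x in (+1, -1))     # P(B ; a,b,L); equals pB(b, L)[B_out]
P_A_given_B = P_joint(A_out, B_out, a, b, L) / P_B          # P(A|B ; a,b,L)

print(abs(P_A_given_B - pA(a, L)[A_out]) < 1e-12)           # True: equals P(A ; a,L)
```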
ttn said: Just to repeat my request above, can you clarify how this applies to orthodox QM? Because surely in OQM, we don't have
P(A,B;a,b,L) = P(A;a,b,L) * P(B;a,b,L).
Right? Somehow you've got to "conditionalize" (or whatever) one of the two factors on the right on the other outcome (just like Bayes' rule requires). You seem to be saying that there is no need or ability to do this, yet OQM requires it...
I'm sorry that I formulated this badly: I didn't mean to imply that just any Kolmogorov system has to have this factorisation, of course! What I meant to say (and expressed poorly) was:
P(A ; a,b,L) is perfectly well defined. You do not have to say that the expression is somehow "incomplete" because I didn't include B in the list to the right. I could have been talking about ANOTHER quantity, P(A|B ; a,b,L); only, I didn't talk about it, I didn't need it, because I only wanted to demonstrate P(A,B ; a,b,L) = P(A ; a,L) x P(B ; b,L).
That's a perfectly sensible statement, and the three quantities are well defined in just any Kolmogorov system (the equality, of course, is not always true and has to be demonstrated for the case at hand).
I could also talk about things like P(A|B ; a,b,L) and so on, but I simply didn't need to. It is not an ERROR to talk about P(A ; a,b,L), and in doing so I do not make any assumption. That's what I put badly as "it is built into the axioms of probability theory".
ttn said: I hate to make a fuss over terminology, but could you use the technical term "parameter independence" if that's what you mean? Or "signal locality" if that's what you mean? (And btw, these are not the same. Violating signal locality requires parameter-dependence *and* a sufficient control over the prepared initial state of the system.)
That is correct. There could of course be a conspiracy in which L compensates for every change in a that I make. I assume, of course, the same L.
ttn said: Here you're sliding back and forth between "signal locality" and "what relativity requires." Remember, Bohmian Mechanics is also consistent with signal locality, yet somehow you (and most others) think that this theory is inconsistent with relativity. No double standards.
Ok, we've had this discussion already a few times. Because the statistical predictions of both theories are identical, there is no way to discriminate between the two on those "black box" outcomes, of course. It is a matter of the esthetics of the inner workings. If you need to write, in the equations, that the state HERE is directly a function of the state (or its rate of change) THERE, then the thing is not considered local, even if what you crank out of it doesn't show the difference. Sometimes this can be an artefact. For instance, the principle of least action is certainly not something local: you need to integrate over vastly remote pieces of space just to find out what you will do here. So that theory is a priori non-local. If you can rewrite it as a differential equation (Euler-Lagrange), then it has become local. But the result is the same.
ttn said: I still don't understand what you think this proves. Is it: that a deterministic theory automatically respects "outcome independence"? I suppose that's true, especially if you *define* determinism in terms of
P(A|a,b,L)
and
P(B|a,b,L)
equalling either 0 or 1. But then, what's actually relevant is not that those probabilities equal {0,1}, but simply that you've written them without any "outcome dependence"! And obviously a theory with no outcome dependence will respect OI. But that has nothing to do with whether it's deterministic.
Again, I don't care about "outcome independence". I didn't need conditional probabilities at all. I needed to SHOW that P(A,B) factorizes into P(A) x P(B). This can be rewritten into something that uses outcome independence if you like, but I don't care.
What I wanted to show was that from determinism (all probabilities are 1 or 0) and from information locality (P(A ; a,b,L) = P(A ; a,L) and P(B ; a,b,L) = P(B ; b,L)) follows the factorization statement that is Bell locality:
P(A,B ; a,b,L) = P(A ; a,L) x P(B ; b,L).
In all this, I never used a conditional probability (and hence didn't need to say "outcome independence"). I used a property of the parametrisation of the family of distributions (namely, that all distributions with the same b and L give the same probabilities for events B, no matter what a is; this comes down to saying that my free choice of a has no influence on the probabilities of events at Bob's side); and I used a property of each individual distribution (namely determinism, so that every value of the mapping P is 1 or 0).
From that, I derived P(A,B ; a,b,L) = P(A ; a,L) x P(B ; b,L).
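A minimal sketch of that argument in Python (the deterministic rules f and g below are invented purely for illustration, they are not meant to reproduce any real physics): when A = f(a, L) and B = g(b, L), every probability is 0 or 1 and the factorization holds for every parameter choice.
```python
def f(a, L):                                 # Alice's outcome: depends only on a and L
    return +1 if (a + L) % 2 == 0 else -1

def g(b, L):                                 # Bob's outcome: depends only on b and L
    return +1 if (b * L) % 3 == 0 else -1

def P_A(A_out, a, L):                        # P(A ; a,L): 1 if the event occurs, else 0
    return 1.0 if f(a, L) == A_out else 0.0

def P_B(B_out, b, L):                        # P(B ; b,L)
    return 1.0 if g(b, L) == B_out else 0.0

def P_joint(A_out, B_out, a, b, L):          # P(A,B ; a,b,L)
    return 1.0 if (f(a, L), g(b, L)) == (A_out, B_out) else 0.0

# exhaustive check of the factorization over a small grid of parameters and outcomes
for a in range(3):
    for b in range(3):
        for L in range(5):
            for A_out in (+1, -1):
                for B_out in (+1, -1):
                    assert P_joint(A_out, B_out, a, b, L) == P_A(A_out, a, L) * P_B(B_out, b, L)
print("factorization P(A,B ; a,b,L) = P(A ; a,L) x P(B ; b,L) holds on the whole grid")
```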
That's sufficient. I can now of course bring one right hand side member to the left, and write:
P(A,B ; a,b,L) / P(B ; b,L) = P(A ; a,L)
and use the definition of conditional probability on the left:
P(A|B ; a,b,L) = P(A ; a,L)
and you will be happy because I have now derived some "outcome independence"; but first of all this makes no mathematical sense in the case of deterministic distributions, because I may be dividing by 0 (P(B ; b,L) is often 0), and second, it is only the use of a definition. Mind you, I didn't ASSUME this: I demonstrated it (although possibly by dividing by 0).
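Continuing the toy deterministic model from the sketch above (again, an invented rule, just for illustration): for the outcome that does not occur, P(B ; b,L) is exactly 0, so the ratio that would define P(A|B ; a,b,L) is simply undefined there.
```python
def g(b, L):                                 # Bob's deterministic outcome (same toy rule as above)
    return +1 if (b * L) % 3 == 0 else -1

b, L = 1, 1                                  # here g(1, 1) = -1, so the event "B = +1" never happens
P_B_plus = 1.0 if g(b, L) == +1 else 0.0
print(P_B_plus)                              # 0.0: conditioning on this event is not defined
```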
ttn said: As far as I can tell, this is true by fiat only. You define "determinism" in a way that precludes outcome dependence from the very beginning. But this is misleading and unnecessary, since we know that Bell Locality = OI and PI *regardless* of whether or not we have also determinism.
I would really like to know where I USED "outcome independence" and how this is defined. I used only a parametrized set of distributions, parametrized by a, b and L (meaning that all my probabilities are fixed once these parameters are fixed); then I used different events (subsets of Omega), namely A, B and (A sect B), to which I applied my now well-defined distribution. I showed that under the conditions I posed, P(A,B) = P(A) x P(B). That's all. I never needed to use a conditional probability, so I don't see where I made such an assumption.
cheers,
Patrick.