Is Bell's Logic Aimed at Decoupling Correlated Outcomes in Quantum Mechanics?

  • Thread starter Gordon Watson
  • Start date
  • Tags
    Logic
In summary, the conversation discusses the separation of Bell's logic from his mathematics and the understanding of one in relation to the other. A paper by Bell is referenced, where he suggests decoupling outcomes in order to avoid inequalities. However, his logic is deemed flawed and it is concluded that the implications of Bell's lambda and his logic are not fully understood. The importance of Bell's theorem in the physics community is also questioned.
  • #106
JesseM said:
The universe has an objectively real state at every moment; I don't know what it would mean to say "the universe is something objectively real" apart from this.

Can you define what a 'state' is?
JesseM said:
The definition is broad--in some universes satisfying the definition it might be possible to break down the "state of the universe at a given moment" into a collection of local states of each point in space at that time, but in others it might not be.

The only universe I am concerned about is one in which the principle of local causality is applicable.
JesseM said:
Instead of saying that a theory is "local" or "nonlocal" as a whole, let's say that some mathematically-definable element of a theory is local if 1) all facts about the value of this element can be broken down into local facts about individual points in spacetime, and 2) the value at one point is only causally influenced by local facts in the point's past light cone. So in this case, if the "element" in the copenhagen interpretation is the density matrix for a measurement at a single place and time, then I think it'd make sense to say this element is local even if the copenhagen interpretation is not realist, and even though other elements of the theory like the wavefunction for entangled particles cannot really be considered local. In the case of a local realist theory, the "element" would consist of all objective facts about the state of the universe.

If the 'element' would consist of all objective facts about the state of the universe, then the density matrix cannot be part of them. A density matrix represents a state of knowledge about a system, not the objective facts about the system.

In any case, based on what you said, it seems that you would have to define a locally causal theory as one in which every objectively real element of a theory satisfies your 1) and 2). So if someone denies your definition of realism, then you cannot even formulate a locally causal theory.
 
Last edited:
  • #107
Maaneli said:
Also, we know that realism is not a sufficient premise for Bell. After all, there exist theories of nonlocal (contextual or noncontextual) beables which violate the Bell inequalities.

This is a debatable point. First, you and every other Bohmian I know agree that nature is contextual. So that is pretty well rejecting realism a priori (which is a reasonable view).

Second, there are some who feel that the Bohmian program asserts forms of realism that are experimentally excluded. Now, I realize that Bohmians reject this evidence. I am just saying it is debatable. I personally am accepting that the Bohmian viewpoint is NOT clearly ruled out experimentally. Although it would be nice to see the Bohmian side come up with something first for a change. (Like some/any prediction that could be tested.)

And third, I personally think that realism IS enough for the Bell result (I have presented this logic previously). But that is a minority view. There, you have it and heard it here first. I am stating something deviant. :biggrin:
 
  • #108
DrChinese said:
This is a debatable point. First, you and every other Bohmian I know agree that nature is contextual. So that is pretty well rejecting realism a priori (which is a reasonable view).

The acceptance of contextuality only implies a rejection of the realism in, say, classical mechanics. But in a contextual theory like deBB, the particles always have definite positions in spacetime, whether or not they are measured. So I don't see how one can think that the contextuality of deBB theory implies a rejection of realism altogether.
DrChinese said:
Second, there are some who feel that the Bohmian program asserts forms of realism that are experimentally excluded. Now, I realize that Bohmians reject this evidence. I am just saying it is debatable.

Who claims this? I'm not familiar with anyone in the physics literature who seriously argues that the realism implied by deBB is experimentally excluded. Certainly Tony Leggett does not assert this. Nor does Zeilinger.
DrChinese said:
I personally am accepting that the Bohmian viewpoint is NOT clearly ruled out experimentally. Although it would be nice to see the Bohmian side come up with something first for a change. (Like some/any prediction that could be tested.)

Do you know of Valentini's work on nonequilibrium deBB field theory in inflationary cosmology?

I've also developed some semiclassical deBB gravity models which could, in principle, be experimentally discriminated from standard semiclassical gravity theory, through the use of matter-wave interferometry with macromolecules. But that's currently new and unpublished work.
DrChinese said:
And third, I personally think that realism IS enough for the Bell result (I have presented this logic previously). But that is a minority view. There, you have it and heard it here first. I am stating something deviant. :biggrin:

Yeah, perhaps you won't be surprised if I say that I'm extremely skeptical of this claim. :wink:

But I might like to see that argument just for kicks.
 
Last edited:
  • #109
JesseM said:
It often seems like you may be intentionally playing one-upmanship games where you snip out all the context of some question or statement I ask and make it sound like I was confused about something very trivial
Pot calling kettle black.

JesseM said:
This scenario, where there is a systematic bias in how doctors assign treatment which influences the observed correlations in frequencies between treatment and recovery in the sample, is a perfectly well-defined one
And this is different from the issue we are discussing how exactly? Haven't I told you umpteen times that Aspect-type experimenters are unable to make sure there is no systematic bias in their experiments? How do you expect me to continue a discussion with you if you ignore everything I say and keep challenging every tiny tangential issue, like the meaning of fair, or the meaning of population? You think I have all the time in the world to be following you down these rabbit trails which are not directly relevant to the issue being discussed. Have you noticed every one of your responses is now three posts long, mostly filled with tangential issues? Are you unable to focus in on just what is relevant? You may have the time for this but I don't.

JesseM said:
In general I notice that you almost always refuse to answer simple questions I ask you about your position
See the previous paragraph for the reason why. I answer the ones that I believe will further the relevant discussion and ignore temptations to go down yet another rabbit trail.

"Rational degree of belief" is a very ill-defined phrase. What procedure allows me to determine the degree to which it is rational to believe a particular outcome will occur in a given scenario?
It is well defined to me. If you disagree, give an example and I will show you how a rational degree of belief can be formed. Or better, give an example in which you think the above definition does not apply. My definition above covers both the "frequentists" and "bayesian" views as special cases, each of which is not a complete picture by itself. If you think it does not, explain in what way it does not.
JesseM said:
But the frequentist interpretation is just about hypothetical repetitions, which can include purely hypothetical ideas like "turning back the clock" and running the same single experiment over again at the same moment (with observable conditions held the same but non-observed conditions, like the precise 'microstate' in a situation where we have only observed the 'macrostate', allowed to vary randomly) rather than actually repeating it at successively later times (which might be impossible because the original experiment destroyed the object we were experimenting on, say).
So why are you so surprised when I tell you that such idealized problems, which presuppose infinite independent repetitions of a "random experiment" can not be directly compared to anything real, where infinite repetition of a "random experiment" is not possible? If Bell's theorem were an entirely theoretical exercise with no comparison being made to reality, and no conclusions about reality being drawn from it, do you really believe we would be having this discussion?

If it is your view that Bell's inequalities do not say anything about reality, and that no reasonable physicist can possibly draw any conclusions about the real world from Bell's theorem, then we can end this quibble, because you and I will be in full agreement. Is that what you are saying?

JesseM said:
The comment above says nothing of the sort. I'm just saying that to talk about "probability" in the frequentist interpretation you need to define the conditions that you are imagining being repeated in an arbitrarily large number of trials.
No I don't. You are the one who insists probability must be defined that way not me.

JesseM said:
would you agree that when defining the sample space, we must define what process was used to assign treatments to patients, that a sample space where treatment was assigned by doctors would be a different one than a sample space where treatment was assigned by a random number generator on a computer?
Yes, I have told you as much recently. But what has that got to do with anything? All I am telling you is that you cannot compare a probability defined on one sample space with one defined on another. My point is, just because you use a random number generator does not mean you have the same probability space as the idealized, infinitely repeated one you theorized about. What don't you understand about that?

JesseM said:
I'm not asking you to "compare probabilities defined on different probability spaces", and Bell's argument doesn't require you to do that either.
Oh, but that is what you are doing by comparing Bell's inequalities with the results of Aspect type experiments whether Bell "requires" it or not. It is not about what Bell requires, it is about what is done every time a real experiment is compared to Bell's inequalities.

JesseM said:
Sure it would be. If treatment was assigned by a random number generator, then in the limit as the number of trials went to infinity the probability of any correlation between traits of patients prior to treatment (like large kidney stones) and the treatment they were assigned would approach 0.
How exactly can actual doctors doing actual experiments repeat the trial to infinity?

JesseM said:
This is just because there isn't any way the traits of patients would causally influence the random number generator so that there would be a systematic difference in the likelihood that patients with different versions of a trait (say, large vs. small kidney stones) would be assigned treatment A vs. treatment B. Do you disagree?

If you were repeating the experiment an infinite number of times with the random number generator producing two groups every time, then I agree that theoretically, the average composition of both groups will tend towards the same value. But in the real world, you do not have an infinite number of people with kidney stones, and it is impossible to repeat the experiment an infinite number of times. Therefore, unless the experimenters know that the size of the stones matters, and specifically control for that, the results of their single experiment cannot be compared to any idealized, theoretical result obtained by repeating a hypothetical experiment an infinite number of times. Is this too difficult to understand?
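
To see the difference concretely, here is a small sketch (Python, with invented numbers, purely for illustration) of how the group imbalance left by a random number generator behaves at finite N compared with the idealized infinite-repetition limit:

[code]
import random

def imbalance(n_patients, p_large=0.5, seed=0):
    """Assign each patient to treatment A or B by coin flip and return the
    difference between the two groups in the fraction of large-stone patients."""
    rng = random.Random(seed)
    counts = {"A": [0, 0], "B": [0, 0]}  # [large-stone patients, total patients] per group
    for _ in range(n_patients):
        large = rng.random() < p_large            # hidden trait, unknown to the RNG
        group = "A" if rng.random() < 0.5 else "B"
        counts[group][0] += large
        counts[group][1] += 1
    fractions = [c[0] / c[1] for c in counts.values() if c[1] > 0]
    return abs(fractions[0] - fractions[1]) if len(fractions) == 2 else 1.0

for n in (100, 10_000, 1_000_000):
    print(n, round(imbalance(n), 4))
# The imbalance is nonzero for any finite n (it shrinks roughly like 1/sqrt(n))
# and only vanishes in the idealized limit of infinitely many repetitions.
[/code]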
 
  • #110
JesseM said:
So, do you agree with my statement that of these two, Only the second sense of "fair sample" is relevant to Bell's argument?
The concept that a fair sample is needed to be able to draw inferences about the population from a sample of it is relevant to Bell's argument, irrespective of which specific type of fair sample is at issue in a specific experiment.

In post #91 you said the following, numbered for convenience
JesseM said:
As before, you need to explain what "the population" consists of.
1) Again, does it consist of a hypothetical repetition of the same experimental conditions a much larger (near-infinite) number of times? If so, then by definition the actual sample could not be "systematically biased" compared to the larger population, since the larger population is defined in terms of the same experimental conditions.
2) Perhaps you mean repeating similar experimental conditions but with ideal detector efficiency so all particle pairs emitted by the source are actually detected, which would be more like the meaning of the "fair sampling assumption"?

1) Wrong. IF you define the population like that, the actual sample in a real experiment can still be systematically biased compared to the large population, IF those doing the experiment have no way to ensure that they are actually repeating the same experiment multiple times, even if it were possible to actually repeat it multiple times.

2) A fair sample in the context of Aspect-type experiments means that the probabilities of non-detection at Alice and Bob are independent of each other, and also independent of the hidden elements of reality.



JesseM said:
To make the question more precise, suppose all of the following are true:

1. We repeat some experiment with particle pairs N times and observe frequencies of different values for measurable variables like A and B

2. N is sufficiently large such that, by the law of large numbers, there is only a negligible probability that these observed frequencies differ by more than some small amount [tex]\epsilon[/tex] from the ideal probabilities for the same measurable variables (the 'ideal probabilities' being the ones that would be seen if the experiment was repeated under the same observable conditions an infinite number of times)

3. Bell's reasoning is sound, so he is correct in concluding that in a universe obeying local realist laws (or with laws obeying 'local causality' as Maaneli prefers it), the ideal probabilities for measurable variables like A and B should obey various Bell inequalities

...would you agree that if all of these are true (please grant them for the sake of the argument when answering this question, even though I know you would probably disagree with 3 and perhaps also doubt it is possible in practice to pick a sufficiently large N so that 2 is true), then the experiment constitutes a valid test of local realism/local causality, so if we see a sizeable violation of Bell inequalities in our observed frequencies there is a high probability that local realism is false? Please give me a yes-or-no answer to this question.

No I do not agree. The premises you presented are not sufficient (even if they were all true) for your conclusion, that the experiment constitutes a valid test of local realism, to be true. Here is an example I have given you in a previous thread which I believe makes the point clearer:

The point is that certain assumptions are made about the data when deriving the inequalities, that must be valid in the data-taking process. God is not taking the data, so the human experimenters must take those assumptions into account if their data is to be comparable to the inequalities.

Consider a certain disease that strikes persons in different ways depending on circumstances. Assume that we deal with sets of patients born in Africa, Asia and Europe (denoted a,b,c). Assume further that doctors in three cities, Lyon, Paris, and Lille (denoted 1,2,3), are assembling information about the disease. The doctors perform their investigations on randomly chosen but identical days (n) for all three, where n = 1,2,3,...,N for a total of N days. The patients are denoted Alo(n) where l is the city, o is the birthplace and n is the day. Each patient is then given a diagnosis of A = +1/-1 based on presence or absence of the disease. So if a patient from Europe examined in Lille on the 10th day of the study was negative, A3c(10) = -1.

According to the Bell-type Leggett-Garg inequality

Aa(.)Ab(.) + Aa(.)Ac(.) + Ab(.)Ac(.) >= -1

In the case under consideration, our doctors can combine their results as follows

A1a(n)A2b(n) + A1a(n)A3c(n) + A2b(n)A3c(n)

It can easily be verified that by combining any possible diagnosis results, the Leggett-Garg inequality will not be violated, as the result of the above expression will always be >= -1, so long as the cyclicity (XY+XZ+YZ) is maintained. Therefore the average result will also satisfy that inequality, and we can drop the indices and write the inequality based only on place of origin as follows:

<AaAb> + <AaAc> + <AbAc> >= -1

Now consider a variation of the study in which only two doctors perform the investigation. The doctor in Lille examines only patients of type (a) and (b), and the doctor in Lyon examines only patients of type (b) and (c). Note that patients of type (b) are examined twice as often. The doctors, not knowing or having any reason to suspect that the date or location of examination has any influence, decide to designate their patients only by place of origin.

After numerous examinations they combine their results and find that

<AaAb> + <AaAc> + <AbAc> = -3

They also find that the single outcomes Aa, Ab, Ac appear randomly distributed around +1/-1, and they are completely baffled. How can the single outcomes be completely random while the products are not? After lengthy discussions they conclude that there must be a superluminal influence between the two cities.

But there are other, more reasonable explanations. Note that by measuring in only two cities they have removed the cyclicity intended in the original inequality. It can easily be verified that the following scenario will produce exactly what they observed:

- on even days, Aa = +1 and Ac = -1 in both cities, while Ab = +1 in Lille and Ab = -1 in Lyon
- on odd days, all signs are reversed

In the above case
<A1aA2b> + <A1aA2c> + <A1bA2c> = -3
which is consistent with what they saw. Note that this expression does NOT maintain the cyclicity (XY+XZ+YZ) of the original inequality for the situation in which only two cities are considered and one group of patients is measured more than once. But by dropping the city indices, it gives the false impression that the cyclicity is maintained.

The reason for the discrepancy is that the data is not indexed properly to provide a data structure consistent with the inequalities as derived. Specifically, the inequalities require cyclicity in the data, and since experimenters cannot possibly know all the factors in play in order to index the data so as to preserve the cyclicity, it is unreasonable to expect their data to match the inequalities.

For a fuller treatment of this example, see Hess et al, Possible experience: From Boole to Bell. EPL. 87, No 6, 60007(1-6) (2009)
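
Just to make the arithmetic explicit, the two situations above can be checked with a short script (a rough Python sketch of the even/odd-day scenario; it is not taken from the Hess et al. paper):

[code]
# Rough sketch: check the two-doctor scenario described above.
# City 1 = Lille examines patients (a) and (b); city 2 = Lyon examines (b) and (c).

def outcomes(day):
    """Diagnosis outcomes on a given day, per the even/odd rule above."""
    s = 1 if day % 2 == 0 else -1           # all signs reverse on odd days
    return {("1", "a"): s, ("1", "b"): s,    # Lille: Aa = +1, Ab = +1 on even days
            ("2", "b"): -s, ("2", "c"): -s}  # Lyon:  Ab = -1, Ac = -1 on even days

N = 1000
total = 0.0
for day in range(N):
    A = outcomes(day)
    # The products the two doctors actually form once they drop the city index:
    total += (A[("1", "a")] * A[("2", "b")]
              + A[("1", "a")] * A[("2", "c")]
              + A[("1", "b")] * A[("2", "c")])
print(total / N)   # prints -3.0, "violating" the >= -1 bound

# By contrast, when all three outcomes Aa, Ab, Ac carry a single common index
# (the cyclic XY + XZ + YZ structure), the bound always holds:
assert all(a*b + a*c + b*c >= -1
           for a in (-1, 1) for b in (-1, 1) for c in (-1, 1))
[/code]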

The key word is "cyclicity" here. Now let's look at various inequalities:

Bell's equation (15):
1 + P(b,c) >= | P(a,b) - P(a,c)|
a, b, c each occur in two of the three terms, each time together with a different partner. However, in actual experiments the (b,c) pair is analyzed at a different time from the (a,b) pair, so the b's are not the same. Just because the experimenter sets a macroscopic angle does not mean that the complete microscopic state of the instrument, which he has no control over, is in the same state.

CHSH:
|q(d1,y2) - q(a1,y2)| + |q(d1,b2)+q(a1,b2)| <= 2
d1, y2, a1, b2 each occur in two of the four terms. The same argument as above applies.

Leggett-Garg:
Aa(.)Ab(.) + Aa(.)Ac(.) + Ab(.)Ac(.) >= -1

All of your premises could be true, and you will still not avoid the pitfall, if the data is not indexed in accordance with the expectations of the inequalities. But it is impossible to do that.
 
  • #111
JesseM said:
If the "population" was explicitly defined in terms of an infinite set of repetitions of the exact observable experimental conditions you were using, then by definition your experimental conditions would not show any systematic bias and would thus be a "fair sample". And Bell's theorem doesn't assume anything too specific about the observed experimental conditions beyond some basic criteria like a spacelike separation between measurements (though it may be that 100% detector efficiency is needed as one of these criteria to make the proof rigorous, in which case a frequentist would only say that Bell's inequalities would be guaranteed to hold in an infinite repetition of an experiment with perfect detector efficiency, and any actual experiment with imperfect efficiency could be a biased sample relative to this infinite set)
Is it your claim that Bell's "population" is defined in terms of "an infinite set of repetitions of the exact observable experimental conditions you were using"? If that is what you mean here then I fail to see the need to make any fair sampling assumption at all. Why would the fact that detectors are not efficient not already be included in what you call "the exact observable experimental conditions you were using"? So either, 1) that is not what Bell's population is defined as, or 2) No experimental condition testing Bell's inequalities will ever be unfair, so there is no point even making a "fair sampling assumption". Or maybe you do not understand that fair sampling is not about detector efficiency. I could have a fair sample with 1% detector efficiency, provided the rejection of photons was not based on a property of the photons themselves.

If the apparatus "rejects photons" then doesn't that mean you don't have "a 100% efficient detector", by definition?
No, it doesn't mean that at all. In Aspect-type experiments you have a series of devices like beam splitters or cross-polarizers, not to mention coincidence counters, before you have any detector. The detector is the device which actually detects a photon. However, even if your detector is 100% efficient and detects everything that reaches it, that doesn't mean everything is reaching it. The rest of the apparatus could be eliminating photons prior to that.
 
  • #112
JesseM said:
For example, imagine that I come to you today and say, I want to do an experiment on dolphins, give me a representative sample of 1000 dolphins. Without knowing anything about the details of my experiment, and all the parameters that affect the outcome of my experiment, could you explain to me how you will go about generating this "random list of dolphins", also tell me what an infinite number of times means in this context. If you could answer this question, it will help tremendously in understanding your point of view.

I can't answer without a definition of what you mean by "representative sample"--representative of what?
Representative of the entire dolphin population.

You can only define "representative" by defining what conditions you are imagining the dolphins are being sampled
Oh, so you are saying you need to know the "hidden" factors in order to be able to generate a fair sample. So then you agree that without a clear understanding of what factors are important for my experiment, you can not possibly produce a representative sample. This is what I have been telling you all along. Do you see now how useless a random number generator will be in such a case, where you have no clue what the "hidden" factors are?
 
  • #113
fair sampling and the scratch lotto-card analogy

Let us now go back to your famous scratch-lotto example:
JesseM said:
The scratch lotto analogy was only a few paragraphs and would be even shorter if I didn't explain the details of how to derive the conclusion that the probability of identical results when different boxes were scratched should be greater than 1/3, in which case it reduces to this:

Perhaps you could take a look at the scratch lotto analogy I came up with a while ago and see if it makes sense to you (note that it's explicitly based on considering how the 'hidden fruits' might be distributed if they were known by a hypothetical observer for whom they aren't 'hidden'):

Suppose we have a machine that generates pairs of scratch lotto cards, each of which has three boxes that, when scratched, can reveal either a cherry or a lemon. We give one card to Alice and one to Bob, and each scratches only one of the three boxes. When we repeat this many times, we find that whenever they both pick the same box to scratch, they always get the same result--if Bob scratches box A and finds a cherry, and Alice scratches box A on her card, she's guaranteed to find a cherry too.

Classically, we might explain this by supposing that there is definitely either a cherry or a lemon in each box, even though we don't reveal it until we scratch it, and that the machine prints pairs of cards in such a way that the "hidden" fruit in a given box of one card always matches the hidden fruit in the same box of the other card. If we represent cherries as + and lemons as -, so that a B+ card would represent one where box B's hidden fruit is a cherry, then the classical assumption is that each card's +'s and -'s are the same as the other--if the first card was created with hidden fruits A+,B+,C-, then the other card must also have been created with the hidden fruits A+,B+,C-.

Is that too long for you? If you just have a weird aversion to this example (or are refusing to address it just because I have asked you a few times and you just want to be contrary)


I have modified it to make the symbols more explicit and the issue more clear as follows:

Suppose we have a machine that generates pairs of scratch lotto cards, each of which has three boxes (1,2,3) that, when scratched, can reveal either a cherry or a lemon (C, L). We give one card to Alice and one to Bob, and each scratches only one of the three boxes. Let us denote the outcomes (ij) such that (CL) means Alice got a cherry and Bob got a lemon. There are therefore only 4 possible pairs of outcomes: CC, CL, LC, LL. Let us denote the pair of choices by Alice and Bob as (ab); for example, (11) means they both selected box 1 on their cards, and (31) means Alice selected box 3 and Bob selected box 1. There are therefore 9 possible choice combinations: 11, 12, 13, 21, 22, 23, 31, 32 and 33.

When we repeat this many times, we find that
(a) whenever they both pick the same box to scratch, they always get the same result. That is whenever the choices are, 11, 22 or 33, the results are always CC or LL.
(b) whenever they both pick different boxes to scratch, they get the same results only with a relative frequency of 1/4.

How might we explain this?
We might suppose that there is definitely either a cherry or a lemon in each box, even though we don't reveal it until we scratch it. In that case, there are only 8 possible cards the machine can produce: CCC, CCL, CLC, CLL, LCC, LCL, LLC, LLL. To explain outcome (a), we might say that the "hidden" fruit in a given box of one card always matches the hidden fruit in the same box of the other card. Therefore the machine must always send the same type of card to both Alice and Bob. However, doing this introduces a conflict with outcome (b), as follows:

Consider the case where the cards sent to Bob and Alice were of the LLC type. Since outcome (b) involves Alice and Bob scratching different boxes, there are six possible ways they could scratch.

12LL (i.e., Alice scratches box 1, Bob scratches box 2, Alice gets Lemon, Bob gets Lemon)
21LL
13LC
31CL
23LC
32CL (i.e., Alice scratches box 3, Bob scratches box 2, Alice gets Cherry, Bob gets Lemon)

Out of the 6 possible outcomes, only 2 (the first two) give the same fruit for both Alice and Bob. Therefore the relative frequency will be 2/6 = 1/3, not 1/4 as observed. The same holds for every card type with two of one fruit and one of the other (and for CCC or LLL cards the results always match), so instruction sets predict a matching frequency of at least 1/3 when different boxes are scratched. This discrepancy is analogous to the violation of Bell's inequalities.
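
This can be checked by brute force. Here is a quick Python sketch that enumerates every card type and every different-box choice:

[code]
from itertools import product

cards = ["".join(p) for p in product("CL", repeat=3)]   # the 8 possible instruction sets
diff_pairs = [(a, b) for a in range(3) for b in range(3) if a != b]  # 6 different-box choices

for card in cards:
    matches = sum(card[a] == card[b] for a, b in diff_pairs)
    print(card, matches, "/", len(diff_pairs))
# Mixed cards (e.g. LLC) give 2/6 = 1/3; uniform cards (CCC, LLL) give 6/6.
# So any mixture of instruction sets yields a matching frequency of at least 1/3,
# never the observed 1/4.
[/code]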

According to JesseM, it is impossible to explain both outcome (a) and outcome (b) with an instruction set as the above illustration shows.


JesseM,
Does this faithfully reflect the example you want me to address? If not point out any errors and I will amend as necessary.
 
  • #114
Maaneli said:
Yeah, perhaps you won't be surprised if I say that I'm extremely skeptical of this claim. :wink:

But I might like to see that argument just for kicks.

It's short and sweet, but you probably won't accept it any more than Norsen did.

A single particle, Alice, has 3 elements of reality at angles 0, 120, 240 degrees. This is by assumption, the realistic assumption, and from the fact that these angles - individually - could be predicted with certainty.

It is obvious from the Bell program that there are NO datasets of Alice which match the QM expectation value. Ergo, the assumption is invalid. And you don't need to consider settings of Bob at all. You simply cannot construct the Alice dataset. QED.

The key difference is that the elements of reality are NOT referring to separate particles. They never were intended to! All the talk about Bob's setting affecting Alice's outcome only relates to Bell tests. But it should be clear that there is no realistic Alice who can match the QM expectation value.
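
To make that concrete, here is a quick brute-force check (a rough Python sketch; the 1/4 figure assumes the usual quantum matching rate cos^2(120) for settings 120 degrees apart):

[code]
from itertools import product
from math import cos, radians

# Any realistic "Alice dataset" assigns a definite +1/-1 to each of the three angles.
pairs = [(0, 1), (0, 2), (1, 2)]
agreements = []
for triple in product((+1, -1), repeat=3):
    # fraction of the three angle pairs on which the predetermined values agree
    agreements.append(sum(triple[i] == triple[j] for i, j in pairs) / 3)

print(min(agreements))            # 1/3: no dataset can do better (lower) than this
print(cos(radians(120)) ** 2)     # 0.25: the assumed QM matching rate at 120 degrees apart
# Every dataset (and hence every mixture of datasets) gives an average matching
# rate of at least 1/3, so none can reproduce the QM value of 1/4.
[/code]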
 
  • #115
fair sampling and the scratch lotto-card analogy

(continuing from my last post)
So far, the conundrum is that the only kind of instruction set which explains outcome (a) produces relative frequencies (1/3) for outcome (b) that are significantly higher than those predicted by QM and observed in experiments (1/4).

There is however one interesting observation not included in the above treatment. In all experiments performed so far, most of the particles sent to the detectors are undetected. In the situation above, this is equivalent to saying that not all the cards sent to Alice or Bob reveal a fruit when scratched.

The alternative explanation:
A more complete example must therefore include "no-fruit" (N) as a possible outcome. So in addition to the four outcomes listed initially (CC, CL, LC, LL), we must add the four cases in which only one fruit is revealed for a pair of cards sent (CN, NC, LN, NL) and the one case in which no fruit is revealed at all (NN). Interestingly, in real experiments, whenever only one of the pair is detected, the whole pair is discarded. This is the purpose of the coincidence circuitry used in Aspect-type experiments.

One might explain this by supposing that a "no-fruit" (N) result is obtained whenever Alice or Bob makes an error by scratching the chosen box too hard, so that they also scratch off the hidden fruit underneath it. In other words, their scratching is not 100% efficient. However, no matter how low their efficiency, if this mistake is made randomly enough, the sample which reveals a fruit will still be representative of the population sent from the card machine, and by considering just those cases in which no mistake was made during scratching (cf. using coincidence circuitry), the conundrum remains. Therefore in this case, the efficiency of the detector does not matter.

There is yet another possibility. What if the "no-fruit" (N) result is an instruction carried by the card itself rather than the result of inefficient scratching? So instead of always having either a cherry or a lemon in each box, we allow for the possibility that some boxes are simply left empty (N) and will therefore never produce a fruit, no matter how efficiently they are scratched.

Keeping this in mind, let us now reconsider the LLC case discussed above, except that the machine now has the freedom to leave one box empty (N) on one card of the pair it generates. For example, the card LNC is sent to Alice while the card LLC is sent to Bob. Note that the machine is no longer sending exactly the same card to both Alice and Bob. The question then is: can this new instruction set explain both outcomes (a) and (b)? Let us verify:

(a) When both Alice and Bob select the same box to scratch, the possible outcomes for the (LNC, LLC) pair of cards sent are 11LL, 33CC and 22NL. However, since the 22NL case results in only a single fruit, it is rejected as an error case. Therefore, in every case in which they both scratch the same box and both reveal a fruit, they always reveal the same fruit. Outcome (a) is therefore explained.

(b) What about outcome (b)? All the possible results when they select different boxes from the (LNC, LLC) pair are 12LL, 21NL, 13LC, 31CL, 23NC, 32CL. As you can see, in 2 of the 6 possible cases only a single fruit is revealed. We therefore reject those two and are left with 4 possible outcomes in which they scratch different boxes and both observe a fruit (12LL, 13LC, 31CL, 32CL). In only one of these (12LL) do they get the same fruit. Therefore, in one out of the four possible outcomes in which they both scratch different boxes and both get a fruit, they get the same fruit, corresponding to a relative frequency of 1/4, just as predicted by QM and observed in real experiments.
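
For completeness, here is a quick enumeration (a rough Python sketch) of the (LNC, LLC) case, discarding the pairs where one side reveals no fruit, just as the coincidence circuitry would:

[code]
alice, bob = "LNC", "LLC"   # box 2 of Alice's card is empty (N)

same_box, diff_box = [], []
for a in range(3):
    for b in range(3):
        pair = (alice[a], bob[b])
        if "N" in pair:
            continue                      # single-fruit case: rejected, like a missed coincidence
        (same_box if a == b else diff_box).append(pair)

print(same_box)                            # [('L', 'L'), ('C', 'C')] -> always the same fruit
matches = sum(x == y for x, y in diff_box)
print(matches, "/", len(diff_box))         # 1 / 4 -> matching frequency of 1/4
[/code]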

The same applies to all other possible instruction sets in which the machine has the freedom to put an empty box in one of the boxes of the pair sent out. The conundrum is therefore resolved.
 
Last edited:
  • #116


billschnieder said:
(continuing from my last post)
So far, the conundrum is that the only kind of instruction set which explains outcome (a) produces relative frequencies (1/3) for outcome (b) that are significantly higher than those predicted by QM and observed in experiments (1/4).

There is however one interesting observation not included in the above treatment. In all experiments performed so far, most of the particles sent to the detectors are undetected. In the situation above, this is equivalent to saying that not all the cards sent to Alice or Bob reveal a fruit when scratched.

The alternative explanation:
...

Well, yes and no. This is an area I am fairly familiar with.

First, we need to agree that the FULL universe in the LR alternative makes a different prediction than what is observed. Therefore it does not match the QM expectation value and Bell's Inequality is respected. Bell's Theorem stands.

Second, it is hypothetically possible to attempt a treatment as you describe. This does have some superficial similarity to the simulation model of De Raedt et al. However, there are in fact extremely severe constraints, and getting somewhere with your idea is MUCH more difficult than you may be giving it credit for. Keep in mind this approach IS NOT RULED OUT BY BELL'S THEOREM. I capitalized those letters because we are moving from one track to an entirely different one. As we will see, there are still elements of Bell's logic to consider here.

Third, let's consider your hypothesis and the constraints it must satisfy. I will just supply a couple so we can have a starting point.

a) The full universe must obey the Bell Inequality, and most authors pick a straight-line function to stay as close to the QM expectation as possible. This means that there exists some BIAS() function which accounts for the difference between the full universe and the sample actually detected. I will discuss this function in a followup post.
b) The alternative model you suggest will make experimentally verifiable predictions. For example, you must be able to show that there are specific parts of the apparatus that are responsible for the absorption of the "missing" radiation. So keep in mind that the complete absence of such effect is a powerful counterexample.

Now, I realize that you may think something like: "a) and b) don't matter, it at least proves that a local realistic position is tenable." But it actually doesn't, at least not in the terms you are thinking. Yes, I freely acknowledge that Bell's Theorem does NOT rule out LR theories that yield DIFFERENT predictions than QM. I think this is generally accepted as possible by the physics community. It is the idea that QM and LR are compatible that is ruled out. So this means that a) and b) are important. As mentioned I will discuss this in a followup post.
 
  • #117
I am attaching a graph of the BIAS() function for a local realistic theory in which Bell's Inequality is respected, as you are suggesting, because the full universe is not being detected. Your hypothesis is what I refer to as the Unfair Sampling Assumption. The idea is that an Unfair Sample can explain the reason why local realism exists but QM predictions hold in actual experiments.

Your LR candidate does not need to follow this graph, but it will at least match it in several respects. Presumably you want to have a minimal bias function, so I have presented that case.

You will notice something very interesting about the bias: it is not equal for all Theta! This is a big problem for a local realistic theory. And why is that? Because Theta should not be a variable in a theory in which Alice and Bob are being independently measured. On the other hand, if your theory can explain that naturally, then you would be OK. But again, this is where your theory will start making experimentally falsifiable predictions. And that won't be so easy to get around, considering every single prediction you make will involve an effect that no one has ever noticed in hundreds of thousands of experiments. So not impossible, but very difficult. Good luck! :smile:
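
To give a feel for why any such bias must vary with Theta, here is a minimal sketch (Python; it assumes, purely for illustration, a cos^2 match rate for the detected sample and a straight-line match rate for the full universe; the attached graph uses its own conventions):

[code]
from math import cos, radians

# Illustrative only: detected-sample match rate follows QM's cos^2(theta),
# while the hypothetical "full universe" follows a straight line from 1 to 0.
for theta in range(0, 91, 15):
    qm_match = cos(radians(theta)) ** 2      # what the detected (sampled) pairs show
    lr_match = 1 - theta / 90                # straight-line local realistic full universe
    bias = qm_match - lr_match               # what the sampling bias would have to supply
    print(f"{theta:3d} deg   QM {qm_match:.3f}   LR {lr_match:.3f}   bias {bias:+.3f}")
# The bias vanishes at 0, 45 and 90 degrees but not in between, i.e. it depends on Theta.
[/code]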
 

Attachments

  • Bell.UnfairSamplingAssumption1.jpg
  • #118


DrChinese said:
Well, yes and no.
Just to be clear, I need some clear answers from you before I proceed to talk about your bias function.
1). Do you agree that my explanation above explains the situation through "instruction sets" -- something Mermin said was not possible?

2) Do you at least admit that Mermin was wrong in declaring that in this specific example which he originated, it is impossible to explain the outcome through an instruction set?

3) Do you admit that the way my explanation works, more closely matches real Aspect-type experiments than Mermin's/JesseM's original example in which non-detection is not considered?

4) Do you agree that without coincidence counting, Bell's inequalities are not violated? In other words, Bell's inequalities are only violated in real experiments when the "full universe" is limited to the full universe of coincidence counts, rather than the "full universe" of emissions from the source? If you disagree, please let me know which "full universe" you are referring to.
 
Last edited:
  • #119


billschnieder said:
Just to be clear, I need some clear answers from you before I proceed to talk about your bias function.
1). Do you agree that my explanation above explains the situation through "instruction sets" -- something Mermin said was not possible?

2) Do you at least admit that Mermin was wrong in declaring that in this specific example which he originated, it is impossible to explain the outcome through an instruction set?

3) Do you admit that the way my explanation works, more closely matches real Aspect-type experiments than Mermin's/JesseM's original example in which non-detection is not considered?

4) Do you agree that without coincidence counting, Bell's inequalities are not violated? In other words, Bell's inequalities are only violated in real experiments when the "full universe" is limited to the full universe of coincidence counts, rather than the "full universe" of emissions from the source? If you disagree, please let me know which "full universe" you are referring to.

1. No one has ever - that I know of - said that an Instruction Set explanation which does NOT match the QM expectation value is impossible.

2. No, Mermin is completely correct.

3. No, there is absolutely no justification whatsoever for your ad hoc model. I have seen this plenty of times previously. For example, the graph I posted was created last year during similar discussion with someone else.

Please look at what I wrote above: a) your hypothesis does not violate Bell's Theorem; and b) your "model", which actually does NOT explain anything at all, would be susceptible to experimental falsification. IF you made any specific prediction, THEN I am virtually certain that existing experiments would prove it wrong. Of course, you would need to make one first. On the other hand, the QM model has been subjected to a barrage of tests and has passed all.

4. Sure, the full universe could consist of photons which are not being detected today. Those photons, hypothetically, could have attributes which are different, on average, than those that were detected. No argument about the principle.

But that would be hotly contested if you actually came out with a model (which you obviously have not). The reason is that there is substantial evidence that no such thing actually occurs! I am not sure how much you know about the generation and detection of entangled particles, but they are not limited to photons. And the statistics don't really leave a whole lot of room for the kind of effect you describe. Particles with mass can be more accurately detected than photons as they have a bigger footprint. For example, Rowe's experiment sees violation of a Bell inequality with detection of the full sample of ions.

http://www.nature.com/nature/journal/v409/n6822/full/409791a0.html

So my point is that it is "easy" to get around Bell by predicting a difference with QM. But that very difference leads to immediate conflict with experiment. That is why Bell's Theorem is so important.
 
  • #120
From the abstract to the Rowe paper referenced above:

"Local realism is the idea that objects have definite properties whether or not they are measured, and that measurements of these properties are not affected by events taking place sufficiently far away. Einstein, Podolsky and Rosen used these reasonable assumptions to conclude that quantum mechanics is incomplete. Starting in 1965, Bell and others constructed mathematical inequalities whereby experimental tests could distinguish between quantum mechanics and local realistic theories. Many experiments (1, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15) have since been done that are consistent with quantum mechanics and inconsistent with local realism. But these conclusions remain the subject of considerable interest and debate, and experiments are still being refined to overcome 'loopholes' that might allow a local realistic interpretation. Here we have measured correlations in the classical properties of massive entangled particles (Be+ ions): these correlations violate a form of Bell's inequality. Our measured value of the appropriate Bell's 'signal' is 2.25 plus/minus 0.03, whereas a value of 2 is the maximum allowed by local realistic theories of nature. In contrast to previous measurements with massive particles, this violation of Bell's inequality was obtained by use of a complete set of measurements. Moreover, the high detection efficiency of our apparatus eliminates the so-called 'detection' loophole."

The first sentence should be recognized as something I have said many times on this board, in various ways. Namely, there are 2 critical assumptions associated with local realism, not 1. Realism being the existence of particle properties independent of the act of observation; and locality being the idea that those properties are not affected by spacelike separated events.
 
Last edited:
  • #121
DrChinese said:
1. No one has ever - that I know of - said an Instruction Set explanation which does NOT match QM expectation value is impossible.

Mermin said:
There is no conceivable way to assign such instruction sets to the particles from one run to the next that can account for the fact that in all runs taken together, without regard to how the switches are set, the same colors flash half the time.

...
Therefore if instruction sets exist, the same colors will flash in at least 5/9 of all the runs, regardless of how the instruction sets are distributed from one run of the demonstration to the next. This is Bell's theorem (also known as Bell's inequality) for the gedanken demonstration. But in the actual gedanken demonstration the same colors flash only 1/2 the time. The data described above violate this Bell's inequality, and therefore there can be no instruction sets.

If you don't already know how the trick is done, may I urge you, before reading how the gedanken demonstration works, to try to invent some other explanation for the first feature of the data that does not introduce connections between the three parts of the apparatus or prove to be incompatible with the second feature.
I just gave you an instruction set explanation for Mermin's gedanken experiment, which he clearly says above is impossible. He did not restrict his statement above to only QM compatible ones, so QM does not even come in for this specific question which I asked.
DrChinese said:
2. No, Mermin is completely correct.
See previous quote from Mermin himself.

DrChinese said:
3. No, there is absolutely no justification whatsoever for your ad hoc model. I have seen this plenty of times previously. For example, the graph I posted was created last year during similar discussion with someone else.
Yes there is justification, which I clearly explained. In all Aspect-type experiments ever performed, only a small proportion of the photons are detected. My model is therefore more representative of the real world than Mermin's or JesseM's, in which non-detection is not accounted for.

DrChinese said:
Please look at what I wrote above: a) your hypothesis does not violate Bell's Theorem; and b) your "model", which actually does NOT explain anything at all, would be susceptible to experimental falsification.
There is already an implicit prediction there, which is contrary to the prediction implicit in the fair sampling assumption. I don't need any extra prediction. The fair sampling prediction is that the particles detected are a fair representation of the universe of particles emitted; my prediction is that the particles detected are not a fair representation of the universe of particles emitted. Even if you detect all particles, you still have to make sure you avoid the pitfall mentioned in post #110. You can measure the full universe of particles and yet not have a fair sample if you do not index the particles correctly.

DrChinese said:
But that would be hotly contested if you actually came out with a model (which you obviously have not).
I did come up with an instruction set model for Mermin's gedanken experiment, did I not? That is what I told you I was going to do. I never promised I would create a full-blown LR quantum theory within a discussion thread. Is that what you were expecting?

BTW the issue here is not whether QM is correct or not, but whether violations of Bell's inequalities by experiments can be interpreted to mean that there is "action at a distance". And by showing that a locally causal explanation of violations of Bell's inequalities is possible, it brings into serious question any claim that such violations imply anything of the sort.

DrChinese said:
For example, Rowe's experiment sees violation of a Bell inequality with detection of the full sample of ions.
http://www.nature.com/nature/journal/v409/n6822/full/409791a0.html
So instead of detection efficiency problems, they have detector accuracy problems. In any case, their experiment does not solve the issue I raised previously, the one I outlined in post #110 here (https://www.physicsforums.com/showpost.php?p=2766980&postcount=110). Fair sampling is not just about 100% detection. It also requires that the data is indexed properly.
DrChinese said:
So my point is that it is "easy" to get around Bell by predicting a difference with QM. But that very difference leads to immediate conflict with experiment. That is why Bell's Theorem is so important.
Is it your view that Bell's theorem is more important than Bell's inequalities, so that even if Bell's inequalities were shown not to be valid, Bell's theorem would still stand?
 
  • #122
billschnieder said:
1. I just gave you an instruction set explanation for Mermin's gedanken experiment, which he clearly says above is impossible. He did not restrict his statement above to only QM compatible ones, so QM does not even come in for this specific question which I asked.

See previous quote from Mermin himself.


2. Is it your view that Bell's theorem is more important than Bell's inequalities, so that even if Bell's inequalities were shown not to be valid, Bell's theorem would still stand?

1. Bill, you are way too intelligent to be writing stuff like this. Mermin stated clearly that the QM expectation value is too low to match the realistic value. If you are going to intentionally misrepresent an example you obviously follow closely, I really don't know what to say. As I have said previously, Bell (and Mermin) never rule out theories that do NOT match QM predictions.

2. I don't know about the "more important" part, but yes... Bell's Theorem stands even if the QM predictions were incorrect. So far that hasn't been an issue, but it is always "possible" I guess.
 
  • #123
billschnieder said:
Yes there is justification, which I clearly explained. In all Aspect-type experiments ever performed, only a small proportion of the photons are detected. My model is therefore more representative of the real world than Mermin's or JesseM's, in which non-detection is not accounted for.


There is already an implicit prediction there, which is contrary to the prediction implicit in the fair sampling assumption. I don't need any extra prediction. The fair sampling prediction is that the particles detected are a fair representation of the universe of particles emitted; my prediction is that the particles detected are not a fair representation of the universe of particles emitted. Even if you detect all particles, you still have to make sure you avoid the pitfall mentioned in post #110. You can measure the full universe of particles and yet not have a fair sample if you do not index the particles correctly.


I did come up with an instruction set model for Mermin's gedanken experiment, did I not? That is what I told you I was going to do. I never promised I would create a full-blown LR quantum theory within a discussion thread. Is that what you were expecting?

You do NOT present a model. A model would be falsifiable. You are just waving your hands. The simulation of De Raedt et al is a step towards a model because it is specific. You do not show in any way how you account for anything. As I showed above, you must demonstrate that your ideas lead to predictions for the sampled photons that match experiment. You completely fail on this count. You must identify where in the apparatus there are parameters that account for "missing" photons. You fail to do that.

And further, you don't even bother to address what a missing photon is. That is not a simple task, as apparently the "missing photons" only show up in pairs! Now, can you even begin to talk about this, since this is a local model? Because there is not one single element of your "model" that matches the experimental evidence, which would be a prerequisite for any candidate model.

Go back to the drawing board, you have absolutely nothing in this example that I haven't seen before. Yawn.
 
  • #124
response to post #109:
JesseM said:
It often seems like you may be intentionally playing one-upmanship games where you snip out all the context of some question or statement I ask and make it sound like I was confused about something very trivial
billschnieder said:
Pot calling kettle black.
Can you point to any examples where you think I have done something like this? I may misunderstand your meaning at times, but I generally quote and respond to almost all the context that you provided, and in cases where I think you may be confused about something basic I usually adopt a tentative tone and say something like "if you are arguing X, then you're misunderstanding Y".
JesseM said:
This scenario, where there is a systematic bias in how doctors assign treatment which influences the observed correlations in frequencies between treatment and recovery in the sample, is a perfectly well-defined one
billschnieder said:
And this is different from the issue we are discussing how exactly?
It's not! If you looked carefully at the context when I brought up this example, you'd see my point was that in this example, we aren't trying to establish a causal relation between treatment and recovery but just want to know the statistical correlation between the two under the given observable conditions (which include the fact that doctors are assigning treatments). In this case, the systematic bias isn't a problem at all, it's just a part of the experimental conditions that we want to determine probabilities for! Suppose that the frequentist "God" knows it happens to be true that in the limit as the number of patients being assigned treatments by these doctors went to infinity, the fraction of patients who recovered under treatment B (more of whom had small gallstones) would be 82%, and the fraction of patients who recovered under treatment A (more of whom had large gallstones) would be 77%. Then if I, in my sample of 700 patients assigned treatments by doctors, found that 83% of those with treatment B recovered and 78% of those with treatment A recovered, the frequentist God would smile beneficently upon my experiment and say "good show old boy, your measured frequencies were very close to the ideal probabilities you were trying to measure!" Of course, if I tried to claim this meant treatment B was causally more effective the frequentist God would become wrathful and cast me down into the lowest circle of statistician hell (reserved for those who fail to remember that correlation is not causation), but I'd remain in his favor as long as I was humble and claimed only that my observed frequencies were close to the ideal probabilities that would result if the same experimental conditions (including the systematic bias introduced by the doctors, which is not a form of sampling bias since the doctors themselves are part of the experimental conditions) were repeated a near-infinite number of times.

And again, my point is that this sort of situation, where we only are interested in the ideal probabilities that would result if the same experimental conditions were repeated a near-infinite number of times, and are not interested in establishing that correlations in observed frequencies represent actual causal influences, is directly analogous to the type of situation Bell is modeling in his equations. Whatever marginal and conditional probabilities appear in his equations, in the frequentist interpretation (which again I think is the only reasonable one to use when understanding what 'probability' means in his analysis) they just represent the frequencies that would occur if the experiment were repeated with the same observable conditions a near-infinite number of times.
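
To make this concrete, here is a toy simulation (Python; all the numbers are invented purely for illustration and chosen only so the limiting values come out near the hypothetical 77%/82% above) of what I mean by observed frequencies approaching ideal probabilities that already include the doctors' systematic assignment bias:

[code]
import random

def run_trial(n_patients, seed=1):
    """Toy model: doctors preferentially give treatment A to large-stone patients,
    and recovery depends on both stone size and treatment (all numbers invented)."""
    rng = random.Random(seed)
    recovered = {"A": [0, 0], "B": [0, 0]}   # [recoveries, patients] per treatment
    p_recover = {("A", True): 0.73, ("A", False): 0.93,   # chosen so the ideal limits
                 ("B", True): 0.66, ("B", False): 0.86}   # come out near 77% (A) and 82% (B)
    for _ in range(n_patients):
        large_stone = rng.random() < 0.5
        treatment = "A" if rng.random() < (0.8 if large_stone else 0.2) else "B"
        recovered[treatment][0] += rng.random() < p_recover[(treatment, large_stone)]
        recovered[treatment][1] += 1
    return {t: round(r / n, 3) for t, (r, n) in recovered.items()}

print(run_trial(700))        # a finite sample: frequencies wobble around the ideal values
print(run_trial(1_000_000))  # a near-"infinite" run: frequencies approach the ideal probabilities
# Those limiting values already reflect the doctors' systematic bias; they say nothing
# about which treatment is causally better, only what frequencies these conditions produce.
[/code]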
billschnieder said:
Haven't I told you umpteen times that Aspect-type experimenters are unable to make sure there is no systematic bias in their experiments?
Yes, and the type of systematic bias I talk about above (which is different from sampling bias) isn't a problem for these experimenters, which is what I have been trying to tell you at least umpteen times. They are just trying to make sure the observed frequencies in the experiments match the ideal frequencies that would obtain if the experiment were repeated a near-infinite number of times under the same observed macro-conditions (while other unobserved conditions, like the exact state of various micro/hidden variables, would be allowed to vary). As long as any systematic correlation between unobserved conditions and observed results (akin to the systematic correlation between unobserved gallstone size and observed treatment type) in the actual experiment is just a mirror of systematic correlations which would also exist in the ideal near-infinite run, then they want that systematic bias to be there if their observed frequencies are supposed to match the ideal probabilities.
billschnieder said:
How do you expect me to continue a discussion with you if you ignore everything I say and keep challenging every tiny tangential issue, like the meaning of fair, or the meaning of population.
I don't think they are tangential though--as you said in this recent post, "Please, note I am trying to engage in a precise discussions so don't assume you know where I am going with this". The fact that you conflate different meanings of "fair" is actually pretty essential, because it means you falsely argue that the experimenters need to control for different values of hidden variables in a manner akin to controlling for gallstone size in the medical experiment where they're trying to determine the causal effectiveness of different treatments, and the fact that they don't need to "control for" the effects of hidden variables in this way in order to test local realism using Bell inequalities is central to my argument. Likewise the meaning of "population" gets to the heart of the fact that you refuse to consider Bell's analysis in terms of the frequentist view of probability (a thoroughly mainstream view, perhaps the predominant one, despite your attempts to portray my talk about infinite repetitions as somehow outlandish or absurd), where my argument is that the frequentist interpretation is really the only clear way to understand the meaning of the probabilities that appear in the proof (especially the ones involving hidden variables which of course cannot be defined in an empirical way by us ordinary mortals who can't measure them).
billschnieder said:
You think I have all the time in the world to be following you down these rabbit trails which are not directly relevant to the issue being discussed.
Notice that whenever I ask questions intended to clarify the meaning of words like "population" and "fair/biased", I tend to say things like "if you want to continue using this term, please answer my questions"...if you think defining these terms is so "tangential", you have the option of just restructuring your argument to avoid using such terms altogether. Likewise, if you don't want to waste a lot of time on the philosophy of probability, you have the option to just say something like "I personally don't like the frequentist interpretation but I understand it's a very traditional and standard way of thinking about probabilities, and since I want to confront your (and Bells') argument on its own terms, if you think the frequentist interpretation is the best way to think about the probabilities that appear in Bell's proof, I'll agree to adopt this interpretation for the sake of the argument rather than get into a lot of philosophical wrangling about the meaning of probability itself". But if you can't be a little accommodating in ways like these, then this sort of wrangling seems necessary to me.
billschnieder said:
Have you noticed every of your responses is now three posts long
None of my responses to individual posts of yours have gone above two posts, actually.
billschnieder said:
See the previous paragraph for the reason why. I answer the ones that I believe will further the relevant discussion and ignore temptations to go down yet another rabbit trail.
Well, at least in your most recent posts you've addressed some of my questions and the lotto example, showing you aren't just refusing for the sake of being difficult. Thanks for that. And see above for why I do think the issues I ask you are relevant and not tangential. If you both refuse to discuss the meaning of terms and the interpretation of probability and refuse "for the sake of the argument" to stop using the terms and adopt the mainstream frequentist view of probability, then I think there is no way to continue having a meaningful discussion.
JesseM said:
"Rational degree of belief" is a very ill-defined phrase. What procedure allows me to determine the degree to which it is rational to believe a particular outcome will occur in a given scenario?
ThomasT said:
It is well defined to me. If you disagree, give an example and I will show you how a rational degree of belief can be formed. Or better, give an example in which you think the above definition does not apply.
OK, here are a few:

--suppose we have a coin whose shape has been distorted by intense heat, and want to know the "probability" that it will come up heads when flipped, which we suspect will no longer be 0.5 due to the unsymmetrical shape and weight distribution. With "probability" defined as "rational degree of belief", do you think there can be any well-defined probability before we have actually tried flipping it a very large number of times (or modeling a large number of flips on a computer)?

--in statistical mechanics the observable state of a system like a box of gas can be summed up with a few parameters whose value gives the system's "macrostate", like temperature and pressure and entropy. A lot of calculations depend on the idea that the system is equally likely to be in any of the "microstates" consistent with that macrostate, where a microstate represents the most detailed possible knowledge about every particle making up the system. Do you think this is justified under the "rational degree of belief" interpretation, and if so how?

--How do you interpret probabilities which are conditioned on the value of a hidden variable H whose value (and even range of possible values) is impossible to measure empirically? I suppose we could imagine a quasi-omniscient being who can measure it and form rational degrees of belief about unknown values of A and B based on knowledge of H, but this is just as non-empirical as the frequentist idea of an infinite set of trials. So would you say an expression like P(AB|H) is just inherently meaningless? You didn't seem to think it was meaningless when you debated what it should be equal to in the OP here, though. If you still defend it as meaningful, I'd be interested to hear how the "rational degree of belief" interpretation deals with a totally non-empirical case like this.
billschnieder said:
My definition above covers both the "frequentists" and "bayesian" views as special cases
How can you view the frequentist view as a "special case" when in their interpretation all probabilities are defined in terms of infinite samples, whereas you seem to be saying the definition of probability should never have anything to do with imaginary scenarios involving infinite repetitions of some experiment?
billschnieder said:
So why are you so surprised when I tell you that such idealized problems, which presuppose infinite independent repetitions of a "random experiment" can not be directly compared to anything real, where infinite repetition of a "random experiment" is not possible?
I'm surprised because here you seem to categorically deny the logic of the frequentist interpretation, when it is so totally mainstream (I noticed on p. 89 of the book I linked to earlier that the whole concept of a 'sample space' comes from Von Mises' frequentist analysis of probability, although it was originally called the 'attribute space') and when even those statisticians who don't prefer the frequentist interpretation would probably acknowledge that the law of large numbers means it is reasonable to treat frequencies in real-world experiments with large samples as a good approximation to a frequentist's ideal frequencies in a hypothetical infinite series of trials. For example, would you deny the idea that if we flip a distorted coin 1000 times in the same style, whatever fraction it comes up heads is likely to be close to the ideal fraction that would occur if (purely hypothetically) we could flip it a vastly greater number of times in the same style without the coin degrading over time?
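(As a minimal sketch of the distorted-coin point: the "true" long-run frequency of 0.62 below is an arbitrary stand-in for whatever the coin's actual limiting frequency happens to be.)

```python
import numpy as np

rng = np.random.default_rng(1)
p_true = 0.62   # assumed long-run frequency of heads for the distorted coin

for n in (10, 1_000, 100_000):
    flips = rng.random(n) < p_true
    print(f"{n} flips: observed frequency of heads = {flips.mean():.3f}")
# By the law of large numbers, the observed frequency in 1000 flips is already very
# likely to be within a few percentage points of the ideal value, and the agreement
# only tightens as the (purely hypothetical) number of flips grows.
```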
billschnieder said:
If Bell's theorem were an entirely theoretical exercise with no comparison being made to reality, and no conclusions about reality being drawn from it, do you really believe we would be having this discussion?
Again, you strangely act as if I am saying something weird or bizarre by talking about infinite repetitions, suggesting either that you aren't familiar with frequentist thought or that you think a huge proportion of the statistics community is thoroughly deluded if they believe the frequentist definition is even meaningful (regardless of whether they favor it personally). Surely you must realize that the mainstream view says ideal probabilities (based on a hypothetical infinite sample size) can be compared with real frequencies thanks to the law of large numbers, and that even if you think I'm wrong to take that view, there's certainly nothing novel about it.
JesseM said:
I'm just saying that to talk about "probability" in the frequentist interpretation you need to define the conditions that you are imagining being repeated in an arbitrarily large number of trials.
billschnieder said:
No I don't. You are the one who insists probability must be defined that way not me.
I don't say it "must be", just that it's a coherent view of probability and it's the one that makes the most sense when considering the totally non-empirical probabilities that appear in Bell's reasoning.
JesseM said:
would you agree that when defining the sample space, we must define what process was used to assign treatments to patients, that a sample space where treatment was assigned by doctors would be a different one than a sample space where treatment was assigned by a random number generator on a computer?
billschnieder said:
Yes, I have told you as much recently. But what has that got to do with anything.
It's got to do with the point I made at the start of this post (repeating something I had said in many previous posts), that if you are explicitly defining your sample space in terms of conditions that cause a systematic correlation between the values of observable and hidden variables (like people being more likely to be assigned treatment B if they have small kidney stones) and just trying to measure the probabilities for observable variables in this sample space (not trying to claim correlations between observable variables mean they are having a causal influence on one another, like the claim that the higher correlation between treatment B and recovery means treatment B is causally more effective), then this particular form of "systematic bias" in your experiments is no problem whatsoever! And this is why, in the Aspect-type experiments, it's no problem if the hidden variables are more likely to take certain values on trials where observable variables like A took one value (Alice measuring spin-up with her detector setting, say) than another value (Alice measuring spin-down).
 
  • #125
Response to post #109, continued from previous post:
billschnieder said:
All I am telling you is that you can not compare a probability defined on one sample space with one defined on another. My point is, just because you use a random number generator does not mean you have the same probability space like the idealized infinitely repeated probability you theorized about. What don't you understand about that.
In the frequentist view, all probabilities in a "probability space" (which is just a sample space with probabilities assigned to each point in the space) are ideal ones that would obtain if you were picking from the same sample space an infinite number of times. So using frequentist definitions the above criticism makes no sense, and you already know I'm thinking in frequentist terms. The number of trials has nothing to do with the definition of the sample space; each point in the sample space refers to possible results that could occur on a single trial of a given experiment. This last part is still true under other interpretations of probability--a Bayesian would still make use of a sample space, and the way the sample space was defined would have nothing to do with the number of trials. Do you disagree?

(I'm also not sure if a Bayesian or holder of some other more empirically-based interpretation would make use of the notion of a probability space at all--if they would, it would presumably have to be one where the probabilities assigned to each point in the sample space could be updated with each new trial.)
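(To keep the terminology straight, here is a minimal sketch of what I mean by a sample space versus a probability space; the outcomes and the numbers assigned to them are arbitrary placeholders, and note that neither object refers to the number of trials anywhere.)

```python
# Sample space: the set of possible outcomes of ONE trial of the experiment.
sample_space = {"up-up", "up-down", "down-up", "down-down"}

# Probability space: the same sample space plus a probability for each point.
# A frequentist reads these numbers as ideal limiting frequencies; a Bayesian
# might update them as data comes in, but the structure itself never mentions
# how many trials have been or will be performed.
probability_space = {
    "up-up": 0.1,
    "up-down": 0.4,
    "down-up": 0.4,
    "down-down": 0.1,
}

assert set(probability_space) == sample_space
assert abs(sum(probability_space.values()) - 1.0) < 1e-12
```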
JesseM said:
I'm not asking you to "compare probabilities defined on different probability spaces", and Bell's argument doesn't require you to do that either.
billschnieder said:
Oh, but that is what you are doing by comparing Bell's inequalities with the results of Aspect type experiments whether Bell "requires" it or not.
Well, the sample space doesn't depend on the number of trials, so we're not comparing probabilities defined on different sample spaces, and a probability space is just sample space + probabilities for each point. I suppose if you have the probabilities in the probability space update with each new trial (which wouldn't happen in the frequentist interpretation, but might in others), then with each new trial of an experiment you have a new probability space, but then no experiment whatsoever that involved multiple trials would have a consistent probability space. It would help if you gave an explicit definition of how you define probabilities in a probability space!
billschnieder said:
It is not about what Bell requires, it is about what is done every time a real experiment is compared to Bell's inequalities.
A comparison that a frequentist would say is reasonable if the number of trials is large (and the conditions match the basic ones assumed by Bell), thanks to the law of large numbers.
billschnieder said:
How exactly can actual doctors doing actual experiments repeat the trial to infinity?
They can't, but actual results become ever less likely to differ significantly from the ideal probabilities the more trials are included, that's the law of large numbers. If you are incredulous about this type of thinking you should take your beef to the majority of the statistics community which finds the frequentist interpretation to be at least coherent if not actively preferred.

Why is it that discussions with scientific contrarians so often get bogged down in debates about foundational issues in science/math which pretty much everyone in the mainstream is willing to take for granted, at least for the sake of argument?
billschnieder said:
If you were repeating the experiment an infinite number of times with the random number generator producing two groups every time, then I agree that theoretically, the average composition for both groups will tend towards the same value.
OK, at least we agree on that.
billschnieder said:
But in the real world, you do not have an infinite number of people with kidney-stones, and it is impossible to repeat the experiment an infinite number of times. Therefore, unless the experimenters know that the size of the stones matter, and specifically control for that, the results of their single experiment, can not be compared to any idealized, theoretical result obtained by repeating a hypothetical experiment an infinite number of times. Is this too difficult to understand?
It's really silly that you ask if your argument is "too difficult to understand" when I'm just giving you the standard understanding of huge numbers of professionals in the statistics community. Again, the standard idea is that the law of large numbers makes it reasonable to treat actual numbers in an experiment with a reasonably large sample size (say, 700) as a good approximation to the ideal probabilities that would obtain if you had an infinite sample size with the same conditions. In the case of your numbers, in post #50 I calculated that if we were using a method that wasn't systematically biased towards assigning different treatments based on kidney stone size (i.e. a method where, in the limit as the number of trials went to infinity, we would expect any correlation between treatment and kidney stone size to approach zero), then the probability that one of the two treatment groups would have 87 or fewer patients with small kidney stones (assuming 357 of the original 700 have small kidney stones) would be 1.77*10^-45, an astronomically unlikely statistical fluctuation. Did you think my calculation there was incorrect?
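(For reference, here is a sketch of how a figure like that can be checked, assuming, as in post #50, that the 700 patients are split into two groups of 350 by a method with no systematic link to stone size, so the small-stone count in one group follows a hypergeometric distribution.)

```python
from scipy.stats import hypergeom

# 700 patients, 357 with small kidney stones; one group of 350 is formed by a
# method with no systematic link to stone size, so its small-stone count is
# hypergeometric.  Probability that this group gets 87 or fewer of them:
p = hypergeom.cdf(87, 700, 357, 350)
print(p)   # an astronomically small tail probability, of the order of the
           # 1.77*10^-45 figure quoted above
```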
 
  • #126
Response to post #110 (also see two-part response to #109 above)
JesseM said:
So, do you agree with my statement that of these two, only the second sense of "fair sample" is relevant to Bell's argument?
billschnieder said:
The concept that a fair sample is needed to be able to draw inferences about the population from a sample of it is relevant to Bell's argument, irrespective of which specific type of fair sample is at issue in a specific experiment.
You're wrong; see the start of my response to #109. If your "population" consists of an infinite series of trials with the same observable conditions, then if those conditions include causal factors which would lead some values of hidden variables to be systematically correlated with observable ones (like doctors causing treatment B to be correlated with small kidney stones), then your sample does not need to be fair in my first sense that such correlations are avoided (i.e. that all variables besides treatment are equally distributed in the two groups).
billschnieder said:
In post #91 you said the following, numbered for convenience
JesseM said:
As before, you need to explain what "the population" consists of.
1) Again, does it consist of a hypothetical repetition of the same experimental conditions a much larger (near-infinite) number of times? If so, then by definition the actual sample could not be "systematically biased" compared to the larger population, since the larger population is defined in terms of the same experimental conditions.
2) Perhaps you mean repeating similar experimental conditions but with ideal detector efficiency so all particle pairs emitted by the source are actually detected, which would be more like the meaning of the "fair sampling assumption"?
1) Wrong. IF you define the population like that, the actual sample in a real experiment can still be systematically biased compared to the large population, IF those doing the experiment have no way to ensure that they are actually repeating the same experiment multiple times, even if it were possible to actually repeat it multiple times.
Yes, "if". However, in frequentist terms the larger population can consist of an infinite set of experiments where some known aspects are constant, while lots of other unknown aspects are allowed to vary in different points in the sample space. For example, in a statistical mechanics analysis your ideal population might consist of an infinite series of trials which all start with a system prepared in the same initial "macrostate" and thus have the same initial values for macro-variables like temperature and pressure, but with the unknown "microstate" (consisting of the most detailed possible information about every particle in the system, like the exact quantum state of the entire system) being allowed to vary randomly as long as it's consistent with the macrostate. This notion of defining the infinite series (or the complete set of possibilities in the sample space) by holding certain knowns constant while allowing unknowns to vary is discussed in the Stanford Encyclopedia article discussing frequentism--see the paragraph that starts "The beginnings of a solution to this problem..." and which goes on to say that "Von Mises (1957) gives us a more thoroughgoing restriction to what he calls collectives — hypothetical infinite sequences of attributes (possible outcomes) of specified experiments that meet certain requirements." So, in any Aspect-type experiment with certain known conditions, a frequentist could reasonably argue that although there are a lot of unknowns which may vary from one trial to another, the frequencies of different values of all these unknowns would converge on a specific set of probabilities in the limit as the number of trials (with the known conditions applying in every one) approached infinity. And as long as your actual number of experiments was large and the known conditions were satisfied in every one, the observed frequencies of observable attributes would be expected to be close to the ideal probabilities for those same attributes.
billschnieder said:
2) A fair sample in the context of Aspect-type experiments means that the probabilities of non-detection at Alice and Bob are independent of each other, and also independent of the hidden elements of reality.
I think we'll overcomplicate things if we get into the detector efficiency loophole. There are plenty of people who argue that we haven't had a perfect test of local realism because existing tests haven't had perfect efficiency of detection, but who agree with Bell's reasoning to the extent that if an experiment with perfect efficiency was done, they'd agree local realism had been falsified. Presumably you would not be one of them--your critique is a lot more basic! So to avoid an everything-but-the-kitchen-sink argument, let's say for the sake of argument that an experiment can be done that closes the various practical loopholes in tests of Bell inequalities mentioned here (and the article does note that 'The exception to the rule, the Rowe et al. (2001) experiment is performed using two ions rather than photons, and had 100% efficiency. Unfortunately, it was vulnerable to the locality loophole'), and focus on why you think even if this type of experiment showed a statistically significant violation of Bell inequalities, you still wouldn't be convinced that local realism was falsified. OK?
JesseM said:
To make the question more precise, suppose all of the following are true:

1. We repeat some experiment with particle pairs N times and observe frequencies of different values for measurable variables like A and B

2. N is sufficiently large such that, by the law of large numbers, there is only a negligible probability that these observed frequencies differ by more than some small amount from the ideal probabilities for the same measurable variables (the 'ideal probabilities' being the ones that would be seen if the experiment was repeated under the same observable conditions an infinite number of times)

3. Bell's reasoning is sound, so he is correct in concluding that in a universe obeying local realist laws (or with laws obeying 'local causality' as Maaneli prefers it), the ideal probabilities for measurable variables like A and B should obey various Bell inequalities

...would you agree that if all of these are true (please grant them for the sake of the argument when answering this question, even though I know you would probably disagree with 3 and perhaps also doubt it is possible in practice to pick a sufficiently large N so that 2 is true), then the experiment constitutes a valid test of local realism/local causality, so if we see a sizeable violation of Bell inequalities in our observed frequencies there is a high probability that local realism is false? Please give me a yes-or-no answer to this question.
billschnieder said:
No I do not agree. The premises you presented are not sufficient (even if they were all true) for the statement in bold to be true. Here is an example I have given you in a previous thread which makes the point clearer
OK, what if I add to condition 1 that the experiment matches all the basic assumptions that were made in deriving the inequality in question--in the case of Bell's original inequality, these would include the assumption that the experimentalists choose randomly between three possible measurements a,b,c, the fact that both the choices of measurement and the measurements themselves are made at a spacelike separation, the implicit assumption of perfect detector efficiency, etc. In that case would you agree? Note that the example you provided does not seem to match the conditions of the Leggett-Garg inequality at all, since the inequality Aa(.)Ab(.) + Aa(.)Ac(.) + Ab(.)Ac(.) >= -1 (assuming that's correct, the wikipedia article seems to say the correct form with three terms would be Aa(.)Ab(.) + Ab(.)Ac(.) - Aa(.)Ac(.) <= 1...there may be other forms though, can you provide a source for yours?) is based on the assumption that we are considering an ensemble of trials where a single system was measured at two of three possible times a,b,c on three occasions: one occasion involving measurements at times a and b (with Qa*Qb being +1 if it's in the same state at both times, -1 if it's in different states at the two times, and Aa(.)Ab(.) just being the average value of Qa*Qb over all trials in the ensemble), the second occasion involving measurements at times a and c, and the third involving measurements at times b and c (see the description of the experiment starting at the bottom of p. 180 in this book, although there they are considering four possible times rather than three). Since your symbols seem to have a totally different meaning, there shouldn't be any reason a physicist would expect the Leggett-Garg inequality to apply to your example with the meaning you've given the symbols.
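(For what it's worth, both three-term forms are algebraically valid bounds whenever Qa, Qb and Qc all take definite values of +1 or -1 on each trial, which is exactly the assumption at issue; here is a brute-force check, just to take the question of the correct form off the table.)

```python
from itertools import product

# If Qa, Qb, Qc each take a definite value of +1 or -1 on every trial, then both
# bounds below hold trial by trial, and therefore also for the ensemble averages
# Aa(.)Ab(.) etc. (given a joint distribution over the three values).
for qa, qb, qc in product((-1, 1), repeat=3):
    assert qa*qb + qa*qc + qb*qc >= -1   # the three-term form quoted from eq. (2a)
    assert qa*qb + qb*qc - qa*qc <= 1    # the three-term form attributed to Wikipedia
print("both inequalities hold for every assignment of +/-1 values")
```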

Response to post #111:
JesseM said:
If the "population" was explicitly defined in terms of an infinite set of repetitions of the exact observable experimental conditions you were using, then by definition your experimental conditions would not show any systematic bias and would thus be a "fair sample". And Bell's theorem doesn't assume anything too specific about the observed experimental conditions beyond some basic criteria like a spacelike separation between measurements (though it may be that 100% detector efficiency is needed as one of these criteria to make the proof rigorous, in which case a frequentist would only say that Bell's inequalities would be guaranteed to hold in an infinite repetition of an experiment with perfect detector efficiency, and any actual experiment with imperfect efficiency could be a biased sample relative to this infinite set)
billschnieder said:
Is it your claim that Bell's "population" is defined in terms of "an infinite set of repetitions of the exact observable experimental conditions you were using"? If that is what you mean here then I fail to see the need to make any fair sampling assumption at all.
In the part in bold I think I made clear that Bell's proof would only apply to the exact observable experimental conditions you were using if it was true that those conditions met the "basic criteria" I mentioned above. I allowed for the possibility that 100% detector efficiency might be one of the conditions needed--DrChinese's subsequent posts seem to say that the original Bell inequalities do require this assumption, although perhaps you can derive other inequalities if the efficiency lies within some known bounds, and he seemed to say that local realist theories which tried to make use of this loophole would need some other physically implausible features. As I said above in my response to #110 though, I would rather keep the issue of the detector efficiency loophole separate from your other critiques of Bell's reasoning, which would seem to apply even if we had an experiment that closed all these known loopholes (and apparently there was one experiment with perfect detector efficiency but it was vulnerable to a separate known loophole).
billschnieder said:
Why would the fact that detectors are not efficient not already be included in what you call "the exact observable experimental conditions you were using"?
If they are not efficient then it is included, and if all you are interested in is trying to find approximate values for the ideal probabilities that would obtain if the same experimental conditions were repeated an infinite number of times, then there's no need to worry about detector efficiency or any of the other conditions used to derive various Bell inequalities. But for each Bell inequality, the physicists deriving it are deducing things about the ideal probabilities that would obtain on an infinite number of trials in a local realist universe if the experiment meets certain conditions. If my experimental conditions X don't match the required conditions Y, then by the law of large numbers the frequencies I observe on a large number of repetitions should be close to the ideal probabilities that would be seen if an experiment with conditions X were repeated an infinite number of times, but there's no reason to believe these ideal probabilities for conditions X will respect any Bell inequality concerning the ideal probabilities for conditions Y. Different experimental conditions will have different ideal probabilities associated with them, there isn't anything surprising about that!
billschnieder said:
So either, 1) that is not what Bell's population is defined as, or 2) No experimental condition testing Bell's inequalities will ever be unfair, so there is no point even making a "fair sampling assumption".
"Fair sampling" can only be defined relative to an infinite "population" of trials in the frequentist view. If you repeat the same measurable conditions on your experiments, then this will automatically be a "fair sample" with respect to an infinite collection of trials with the same conditions, but those conditions may not be the ones that are needed to derive a Bell inequality.

Response to post #112:
billschnieder said:
Representative of the entire dolphin population.
OK, in this case it is easy to have an unfair sample if your sampling method makes it more likely that you will pick certain dolphins over others (ones close to shore over ones far from shore, for example). But as before, in a perfect Aspect-type experiment the "population" is just going to be a hypothetical infinite repetition of the same known experimental conditions, as long as those conditions match those required to derive the relevant inequality (existing experiments have been less-than-perfect about matching all the required conditions, but DrChinese talked a bit about why these loopholes aren't too worrying to physicists).
JesseM said:
You can only define "representative" by defining what conditions you are imagining the dolphins are being sampled
billschnieder said:
Oh, so you are saying you need to know the "hidden" factors in order to be able to generate a fair sample.
I said nothing of the sort. In fact my full response made pretty clear I think you don't:
I can't answer without a definition of what you mean by "representative sample"--representative of what? You can only define "representative" by defining what conditions you are imagining the dolphins are being sampled in the ideal case of an infinite number of trials. If the fact that *I* am making the selection on a particular date (since the dolphin population may change depending on the date) is explicitly part of these conditions, then the infinite set of trials can be imagined by supposing that we are rewinding history to the same date for each new group of 1000 in the infinite collection, and having me make the selection on that date with the same specified observable conditions. So relative to this ideal infinite set, I can use whatever method I like to select my 1000, because the fact that it's up to me to decide how to pick them is explicitly part of the conditions.
In this example, as long as the "specified observable conditions" were met in my actual sample, it would automatically be a fair sample relative to the ideal infinite case where the same conditions were met in every sample of 1000, despite the fact that all sorts of other unobserved micro-conditions could vary randomly each time history was rewound and I made my selection of 1000 (which via the butterfly effect might lead me to make different choices leading to different combinations of dolphins being included in my selection).

Response to #113 and #115:

Again, I acknowledge that most (all?) Bell inequalities include perfect detector efficiency in the assumptions needed for the derivations, but since you seem to have a lot of more basic criticisms of the inequalities and the possibility of testing them even if the detector efficiency loophole was closed (along with other known loopholes), I'd prefer to leave aside this issue for now and focus on what you think would still be wrong with the reasoning even with the loopholes closed.
 
Last edited:
  • #127
JesseM said:
--suppose we have a coin whose shape has been distorted by intense heat, and want to know the "probability" that it will come up heads when flipped, which we suspect will no longer be 0.5 due to the unsymmetrical shape and weight distribution. With "probability" defined as "rational degree of belief", do you think there can be any well-defined probability before we have actually tried flipping it a very large number of times (or modeling a large number of flips on a computer)?
You either want to know the probability or you don't. If you give the exact same scenario above to two people, one of whom has my definition and one of whom has your definition, you will see why your definition is useless in this situation. For me then the important question is what is the most reasonable belief that can be formed based only on the information available and the conclusion is, since I have no specific information to tell me that a specific side (say heads) is more likely than the other, I have to assign equal beliefs to both. Therefore, since the only concrete information I have been given is the fact that we have a coin and only two outcomes are possible, I must assign a probability of 0.5 to each side if I am being reasonable. I'm curious what probability you will assign based on exactly the same information you have given me. NOTE: performing an experiment is the same as obtaining more information; therefore, if you consider the result of any experiment actually performed, I am entitled to use the same information as well in forming my belief. So to be fair you must use exactly the same information you gave me.


--in statistical mechanics the observable state of a system like a box of gas can be summed up with a few parameters whose value gives the system's "macrostate", like temperature and pressure and entropy. A lot of calculations depend on the idea that the system is equally likely to be in any of the "microstates" consistent with that macrostate, where a microstate represents the most detailed possible knowledge about every particle making up the system. Do you think this is justified under the "rational degree of belief" interpretation, and if so how?
Yes of course it is justified. Jaynes, who also defines probability the way I did, has pioneered a lot of work in statistical mechanics. (see http://en.wikipedia.org/wiki/Maximum_entropy_thermodynamics)

--How do you interpret probabilities which are conditioned on the value of a hidden variable H whose value (and even range of possible values) is impossible to measure empirically?
See the link above in the response to the statistical mechanics case you pointed out. I suspect that you are just prejudiced against that definition and haven't given much thought to what it actually means.

How can you view the frequentist view as a "special case" when in their interpretation all probabilities are defined in terms of infinite samples, whereas you seem to be saying the definition of probability should never have anything to do with imaginary scenarios involving infinite repetitions of some experiment?

The frequentist probability is the limit of the relative frequency as the number of trials increases. As you increase the trials, you gain more information about what is likely and what is not; therefore your degree of belief is updated. Believing that the probability of heads is 0.5 before the first experiment is ever performed is rational. But for the same coin, if you now have information that 100 trials have been performed and each resulted in a head, believing the probability of heads is 0.5 is not rational; a rational belief will be closer to 1, and there are straightforward, objective ways to calculate and update your belief as more information becomes available.
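(As a minimal sketch of the kind of updating I have in mind, using a Beta-Binomial model with a uniform prior; that particular model is my choice for illustration here, not the only possible one.)

```python
from scipy.stats import beta

# Uniform Beta(1, 1) prior over the unknown heads probability: "rational
# indifference" before any flips have been seen.
a0, b0 = 1, 1

# Now observe 100 flips, all of them heads.
heads, tails = 100, 0
posterior = beta(a0 + heads, b0 + tails)

print(posterior.mean())          # about 0.99 -- sticking with 0.5 is no longer rational
print(posterior.interval(0.95))  # a 95% credible interval, lying well above 0.5
```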
 
Last edited:
  • #128
JesseM said:
OK, what if I add to condition 1 that the experiment matches all the basic assumptions that were made in deriving the inequality in question
I have previously, in a not-too-distant thread, mentioned the following:

1) Bell's ansatz correctly represents all possible local-causal theories with hidden elements of reality.
2) Bell's ansatz necessarily leads to Bell's inequalities.
3) Experiments violate Bell's inequalities.
Conclusion: Therefore the real physical situation of the experiments is not locally causal.


I agree that IF (1), (2) and (3) are all true, then the conclusion is justified. There are several ways to point out the flaws in the argument. You can assume that (1) is true, and then show that the experiments in (3) were not done according to the assumptions implied in (1). Alternatively and equivalently, you could also assume that the experiments in (3) are perfect, and then show why the assumptions in (1) are not accurate representations of what is actually done in real experiments. In both cases, it comes down to the meaning of Bell's ansatz and how comparable it is to real experiments performed. (2) is indisputable -- I do not believe that Bell made a mathematical error in deriving (2) from (1), which is not to say he did not make extra assumptions that must be realized in real experiments for both to be comparable. So again, the only issue is what the equations mean as concerns being comparable to actual results obtained by experimenters.
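(For concreteness, the ansatz referred to in (1) is the factorization Bell writes down for the correlation; this is the standard form of his equation (2) as I remember it, so treat it as a paraphrase rather than a verbatim citation.)

```latex
P(a,b) \;=\; \int d\lambda \,\rho(\lambda)\, A(a,\lambda)\, B(b,\lambda),
\qquad A(a,\lambda) = \pm 1,\quad B(b,\lambda) = \pm 1,\quad
\int d\lambda\, \rho(\lambda) = 1 .
```

Here λ stands for the hidden elements of reality, ρ(λ) for their distribution, and A and B for the locally determined outcomes at the two stations; premise (1) is the claim that every local-causal hidden-variable theory can be put in this form.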

Note that the example you provided does not seem to match the conditions of the Leggett-Garg inequality at all, since the inequality Aa(.)Ab(.) + Aa(.)Ac(.) + Ab(.)Ac(.) >= -1 (assuming that's correct, the wikipedia article seems to say the correct form with three terms would be Aa(.)Ab(.) + Ab(.)Ac(.) - Aa(.)Ac(.) <= 1...there may be other forms though, can you provide a source for yours?) is based on the assumption that we are considering an ensemble of trials where a single system was measured at two of three possible times a,b,c on three occasions: one occasion involving measurements at times a and b (with Qa*Qb being +1 if it's in the same state at both times, -1 if it's in different states at the two times, and Aa(.)Ab(.) just being the average value of Qa*Qb over all trials in the ensemble), the second occasion involving measurements at times a and c, and the third involving measurements at times b and c (see the description of the experiment starting at the bottom of p. 180 in this book, although there they are considering four possible times rather than three).
You should read Leggett and Garg's original paper (A. J. Leggett and Anupam Garg, Phys. Rev. Lett. 54, 857 (1985)). First of all, my equation is correct. See equation (2a) of that paper. Secondly, your characterization in terms of time, etc. is not correct. There is nothing there limiting it to time; (a,b,c) can be anything you would call a random variable, such as time or detector angle or any other setting. I find the argument that the LG inequality would not apply to my example rather convenient. Mind you, similar inequalities had been known long before Bell, and had been described in more general form by Boole and Vorob'ev (a Soviet mathematician) and applied to many macroscopic scenarios not unlike my example. Besides, the LG inequality was developed precisely because of its applicability to macroscopic situations like my example. See the above paper.

I did not see a specific response to my posts #113 and #115: Do you agree that it resolves the paradox in your scratch-lotto cards example? And is it a reasonable resolution in your view? If not, what is unreasonable about it?
 
  • #129
billschnieder said:
You either want to know the probability or you don't. If you give the exact same scenario above to two people, one of whom has my definition and one of whom has your definition, you will see why your definition is useless in this situation.
The frequentist definition treats probability as an objective quantity in any situation where you have a clear definition of the conditions you want to repeat over an infinite set of trials (it may become less objective if the definition is a bit fuzzier, like if you just say you want the coin flipped by a human in each trial rather than defining some exact coin-flipping mechanism whose exact behavior on each trial depends on its precise initial microstate). P. 89 of the book on interpretations of probability I linked to earlier says:
Concerning this alleged science of probability, we might first ask: 'what is its subject matter?' Von Mises answers as follows: '... just as the subject matter of geometry is the study of space phenomena, so probability theory deals with mass phenomena and repetitive events' (1950: vii). Von Mises' view of geometry as a science is somewhat controversial. Since, however, no one doubts that mechanics is a branch of science, it might therefore be better to state Von Mises' position as follows. Probability theory is a mathematical science like mechanics, but, instead of dealing with the motions and states of equilibrium of bodies and the forces which act on them, it treats 'problems in which either the same event repeats itself again and again, or a great number of uniform elements are involved at the same time' (Von Mises 1928: 11). This emphasis on collections is in striking contrast to the subjective theory, which considers probabilities to be assigned by specific individuals to particular events. In the frequency theory, probabilities are associated with collections of events or other elements and are considered to be objective and independent of the individual who estimates them, just as the masses of bodies in mechanics are independent of the person who measures them.
The comparison with mass is useful. If we spot a distant asteroid with our telescopes, but even the best telescopes can only resolve it as a small dot of light moving against the background stars, then we can't form any very precise estimate of its mass, but we assume it has some objective mass anyway, and we might be able to get a better estimate of this mass in the future with better telescopes or probes. Similarly with the irregular coin, a frequentist would say that if you want to talk about the probability it will land heads when flipped under some reasonably well-defined conditions, there is some objective truth about this probability, but we can only get a good estimate of this probability by flipping it a reasonably large number of times under the specified conditions (or possibly we can calculate it theoretically using the laws of mechanics, in much the same way that thermodynamic probabilities are calculated theoretically using statistical mechanics). You may not find it very useful to say "the probability exists, we just can't estimate it yet" in this situation, but that's not the same as saying the view is incoherent somehow (presumably you don't think it's incoherent to say the same about an asteroid's mass).
billschnieder said:
For me then the important question is what is the most reasonable belief that can be formed based only on the information available and the conclusion is, since I have no specific information to tell me that a specific side (say heads) is more likely than the other, I have to assign equal beliefs to both. Therefore, since the only concrete information I have been given is the fact that we have a coin and only two outcomes are possible, I must assign a probability of 0.5 to each side if I am being reasonable.
But then your estimate of the probability can depend on what parameter you want to be indifferent towards, right? Consider Bertrand's paradox (http://www.math.uah.edu/stat/buffon/Bertrand.xhtml). Suppose we know a chord is going to be selected "randomly" from a unit circle, with one end of the chord at x=1, y=0 and the other at x=X, y=Y, and the center of the circle at x=0, y=0. We know all chords matching these conditions are possible, but we don't know the actual probability distribution. As the article says (it was missing some symbols like smaller than or equal to but I filled them in):
Then we can completely specify the chord by giving any of the following quantities:

* The (perpendicular) distance D from the center of the circle to the midpoint of the chord. Note that 0 <= D <= 1 .
* The angle A between the x -axis and the line from the center of the circle to the midpoint of the chord. Note that 0 <= A <= 2pi.
* The horizontal coordinate X . Note that -1 <= X <= 1 .
The "paradox" lies in the fact that if you choose a uniform probability distribution on any of these parameters, depending on what parameter you choose you get different answers for the probability that the length of the chord will be greater than some set length. Would your approach tell us what specific probability distribution is most "rational" in this situation?
billschnieder said:
Yes of course it is justified. Jaynes, who also defines probability the way I did, has pioneered a lot of work in statistical mechanics. (see http://en.wikipedia.org/wiki/Maximum_entropy_thermodynamics)
Jaynes was just a type of Bayesian, as is explained in subsection F of the Bayes' Theorem and Bayesian Confirmation Theory section from the Stanford Encyclopedia of Philosophy article on "Bayesian Epistemology". With the Bayesian approach to probability, there is the problem of choosing the prior probability distribution that you will then update based on the data from subsequent trials; subsection F distinguishes between "Subjective Bayesians" who "emphasize the relative lack of rational constraints on prior probabilities" (so experimenters are free to use things like intuition when defining the prior distribution) and "Objective Bayesians" who propose rational methods of deciding on the "best" prior distribution to use, and includes Jaynes among these. Perhaps your "rational degree of belief" just means you are a Bayesian who believes there is a single most rational answer for the choice of prior? But you seemed to indicate otherwise in post #109 when you said:
It is well defined to me. If you disagree, give an example and I will show you how a rational degree of belief can be formed. Or better, give an example in which you think the above definition does not apply. My definition above covers both the "frequentists" and "bayesian" views as special cases, each of which is not a complete picture by itself. If you think it does not, explain in what way it does not.
So, can you clarify this? Before any experiments have been done, let's say you and a Bayesian both agree on the prior distribution to assign to different possibilities. As new data comes in, will there ever be a situation where you end up with a different subsequent answer for the probabilities than the Bayesian?
JesseM said:
--How do you interpret probabilities which are conditioned on the value of a hidden variable H whose value (and even range of possible values) is impossible to measure empirically?
billschnieder said:
See the link above in the response to the statistical mechanics case you pointed out. I suspect that you are just prejudiced against that definition and haven't given much thought to what it actually means.
I was asking about hidden variables in that question, not statistical mechanics. In statistical mechanics we assume we know how to define the complete set of microstates compatible with a given macrostate; the same is not true with an undefined hidden-variables theory. Also, I never expressed any "prejudice" against non-frequentist approaches to statistical mechanics like Jaynes'; I was just asking how you would justify treating all microstates as equally likely given that you seemed to be saying you had a novel interpretation of probability which does not match any existing school of thought. Subsection F of the article I linked to above also notes that neither Jaynes nor any other "Objective Bayesian" claims that in every situation there is a single most rational choice for the probability distribution:
In the limit, an Objective Bayesian would hold that rational constraints uniquely determine prior probabilities in every circumstance. This would make the prior probabilities logical probabilities determinable purely a priori. None of those who identify themselves as Objective Bayesians holds this extreme form of the view. Nor do they all agree on precisely what the rational constraints on degrees of belief are. For example, Williamson does not accept Conditionalization in any form as a rational constraint on degrees of belief. What unites all of the Objective Bayesians is their conviction that in many circumstances, symmetry considerations uniquely determine the relevant prior probabilities and that even when they don't uniquely determine the relevant prior probabilities, they often so constrain the range of rationally admissible prior probabilities, as to assure convergence on the relevant posterior probabilities. Jaynes identifies four general principles that constrain prior probabilities, group invariance, maximium entropy, marginalization, and coding theory, but he does not consider the list exhaustive. He expects additional principles to be added in the future. However, no Objective Bayesian claims that there are principles that uniquely determine rational prior probabilities in all cases.
Does your "rational degree of belief" imply otherwise, so that you would say in every situation involving probabilities there is a single correct "rational" answer for how to assign them?

Finally, given that Bell includes p(λ) in the integral in his equation (2), implying that the different values of λ may have different probabilities, do you really think it makes sense to interpret his argument in terms of a maximum-entropy approach where all values for unknown variables are considered equally likely? Perhaps you are interpreting the argument in terms of something like a half-omniscient being who is able to learn the value of λ on each trial and updates the probabilities based on that? Again, I'm not arguing that non-frequentist approaches to probability are "wrong", just that the frequentist interpretation is a coherent one, and that it's the one that makes the most sense to use when discussing Bell's argument. Please consider again what I said in post #124:
Likewise, if you don't want to waste a lot of time on the philosophy of probability, you have the option to just say something like "I personally don't like the frequentist interpretation but I understand it's a very traditional and standard way of thinking about probabilities, and since I want to confront your (and Bells') argument on its own terms, if you think the frequentist interpretation is the best way to think about the probabilities that appear in Bell's proof, I'll agree to adopt this interpretation for the sake of the argument rather than get into a lot of philosophical wrangling about the meaning of probability itself".
Are you completely unwilling to adopt the frequentist definitions even for the sake of argument? Do you disagree that Bell himself was most likely thinking in terms of some more "objective" notion of probabilities than Bayesianism when he wrote his proof?
billschnieder said:
The frequentist probability is the limit of the relative frequency as the number of trials increases. As you increase the trials, you gain more information about what is likely and what is not; therefore your degree of belief is updated.
But frequentists themselves would distinguish between the "actual" probabilities in a given situation and their beliefs about those probabilities, since they treat probabilities in an objective way. The more trials you do, the more likely it is that your estimate of the probabilities is close to the actual probabilities (thanks to the law of large numbers), but the two are conceptually different.
 
Last edited by a moderator:
  • #130
JesseM,
I see that you have ignored my treatment of your lotto-cards example in posts #113 and #115, and I have also not yet seen a concrete response to the example I posted in post #110 involving the Leggett-Garg inequalities. After insisting so many times that I respond to the lotto-cards example, it is rather curious that when I finally did, you said absolutely nothing in response. Or should I be expecting a response on those specific issues soon?

I see no point continuing with a quibble about tangential issues such as the historical debates about the meaning of probability, except to say that your characterization of that debate is one-sided, and anyone interested in finding out Jaynes' views, which I also subscribe to, is welcome to read his book (https://www.amazon.com/dp/0521592712/?tag=pfamazon01-20) or his numerous published articles, available here (http://bayes.wustl.edu/etj/node1.html).

On the other hand, if you want to continue the discussion on topic, about those examples, I am interested.
 
Last edited by a moderator:
  • #131
billschnieder said:
JesseM,
I see that you have ignored my treatment of your lotto-cards example in posts #113 and #115, and I have also not yet seen a concrete response to the example I posted in post #110 involving the Leggett-Garg inequalities. After insisting so many times that I respond to the lotto-cards example, it is rather curious that when I finally did, you said absolutely nothing in response. Or should I be expecting a response on those specific issues soon?
You're awfully impatient; I responded to your post #127 just yesterday and was planning to get to #128, including your request that I go back to the lotto example. I did explain why I didn't address it originally--because I already acknowledge the detection loophole, and because it seems to me you have more fundamental objections to Bell and Aspect besides the fact that existing experiments have not closed that loophole, so I'd prefer to focus on the more basic objections.
billschnieder said:
I see no point continuing with a quibble about tangential issues such as the historical debates about the meaning of probability
I never brought up "historical debates" except to try to clarify your ill-defined notion of probability as "rational degree of belief" by trying to understand how it compares to various well-understood meanings. If you don't want to get into a discussion about the meaning of probability, I am going to continue to use frequentist definitions in interpreting Bell's argument, because the frequentist interpretation is the only one that makes sense to me in that context. Are you going to object?
billschnieder said:
except to say that your characterization of that debate is one-sided, and anyone interested in finding out Jaynes' views, which I also subscribe to, is welcome to read his book (https://www.amazon.com/dp/0521592712/?tag=pfamazon01-20) or his numerous published articles, available here (http://bayes.wustl.edu/etj/node1.html).
"My" characterization was taken verbatim from the Stanford Encyclopedia of Philosophy article. Do you disagree with the article that his position is that of an "Objective Bayesian" whose ideas concerning maximum entropy and so forth were just about picking the right prior distribution? Do you disagree that Jaynes did not believe there was a single correct way to pick the prior in every circumstance?
 
Last edited by a moderator:
  • #132
JesseM said:
I never brought up "historical debates" except to try to clarify your ill-defined notion of probability as "rational degree of belief" by trying to understand how it compares to various well-understood meanings.

Just because you call it ill-defined does not mean it is, as any half-serious look at the numerous works Jaynes has written on this topic would have revealed. I doubt you have ever read a single article written by Jaynes, yet you authoritatively proclaim that my definition is different from Jaynes? I have given you a link to all his articles. Go there and find me a single quote that contradicts what I have told you, then maybe I will engage in the discussion. I stand by my definition that probability means rational degree of belief. This was the view of Bernoulli, Laplace, Jeffreys and Jaynes, and it is well understood (except by you). If you want to start a new thread about the meaning of probability, go ahead and I may join you there.

If you don't want to get into a discussion about the meaning of probability, I am going to continue to use frequentist definitions in interpreting Bell's argument, because the frequentist interpretation is the only one that makes sense to me in that context.
If I thought for a second you were truly interested in understanding my view of probability rather than just quibbling, I would have engaged. However, if you are truly interested in understanding a more generally applicable view of Probability theory than your limited view, I dare you to pick a single Jaynes article on probability and read it. In case you want to quibble about this statement, remember that you could not even give me a probability for the damaged coin situation.

So I suggest you read the following article by Jaynes (Jaynes, E. T., 1990, 'Probability Theory as Logic,' in Maximum-Entropy and Bayesian Methods, P. F. Fougère (ed.), Kluwer, Dordrecht, p. 1, http://bayes.wustl.edu/etj/articles/prob.as.logic.pdf). BTW, be prepared for an epiphany. But my gut tells me you won't read it; you will assume that you already know his views just because you read some third-party statement about him.
 
  • #133
billschnieder said:
JesseM said:
I never brought up "historical debates" except to try to clarify your ill-defined notion of probability as "rational degree of belief" by trying to understand how it compares to various well-understood meanings.
Just because you call it ill-defined does not mean it is, as any half-serious look at the numerous works Jaynes has written on this topic would have revealed.
I meant that it was "ill-defined" at the time I was bringing up historical views to try to clarify your own in relation to them; that was before your most recent post where you said your view was the same as Jaynes'. Before that you had said nothing of the sort; you just said you saw probability as "rational degree of belief", and you had also said 'My definition above covers both the "frequentists" and "bayesian" views as special cases', whereas I had understood Jaynes to just be a type of Bayesian. Again, do you think the summary of Jaynes' views in Section 4, subsection F of the Stanford Encyclopedia of Philosophy article I linked to is incorrect in some way? If so, can you point out where?
JesseM said:
I doubt you have ever read a single article written by Jaynes, yet you authoritatively proclaim that my definition is different from Jaynes?
I have read at least one article by Jaynes on the maximum entropy approach to thermodynamics, but it's true I'm not particularly familiar with his writings, again I am trusting that the summary in the Stanford article is likely to be well-informed.
billschnieder said:
Go there and find me a single quote that contradicts what I have told you; then maybe I will engage in the discussion.
You have told me very little about your views on probability, but you did say your view covers "bayesians" and "frequentists" as special cases, implying it's different from either. I have asked you about this and you ignore my questions, but if you are indeed aligning with neither and claiming this was Jaynes' view too, just look for example at reference 23 at http://bayes.wustl.edu/etj/node1.html where, in section 2.3 on pp. 16-19 of the pdf, he very clearly argues against the frequentist or "ergodic" view of the probabilities in statistical mechanics, which he contrasts to more "subjective" views in which probability is defined in terms of beliefs, and on p. 24 he says:
In seeking to extend a theory to new domains, some kind of philosophy about what the theory "means" is absolutely essential. The philosophy which led me to this generalization was, as already indicated, my conviction that the "subjective" theory of probability has been subjected to grossly unfair attacks from people who have never made the slightest attempt to examine its potentialities; and that if one does take the trouble to rise above ideology and study the facts, he will find that "subjective" probability is not only perfectly sound philosophically; it is a far more powerful tool for solving practical problems than the frequency theory. I am, moreover, not alone in thinking this, as those familiar with the rise of the "neo-Bayesian" school of thought in statistics are well aware.
So here he pretty clearly aligns himself with the neo-Bayesians and against the frequentists, in seeming contrast to your quote above about both being special cases of your view of probability. Perhaps I misunderstood, but like I said you refused to comment further on the quote when I brought it up again a few times. Also, if you do claim that your (and Jaynes') view is not just Bayesianism plus some rules about how to pick the prior probability distribution, it would help if you would address this question of mine from post #129:
Before any experiments have been done, let's say you and a Bayesian both agree on the prior distribution to assign to different possibilities. As new data comes in, will there ever be a situation where you end up with a different subsequent answer for the probabilities than the Bayesian?
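As a concrete illustration of that question, here is a minimal sketch (a hypothetical coin-flip example, nothing to do with Bell's setup) showing that two agents who share the same prior and see the same data necessarily arrive at the same posterior under a standard Bayesian update; any difference in final probabilities would therefore have to enter somewhere other than the updating rule itself:
Code:
from fractions import Fraction

def beta_update(prior, heads, tails):
    # conjugate Beta-Binomial update: Beta(a, b) -> Beta(a + heads, b + tails)
    a, b = prior
    return (a + heads, b + tails)

def posterior_mean(params):
    a, b = params
    return Fraction(a, a + b)

shared_prior = (1, 1)            # a uniform prior both agents agree on
data = (7, 3)                    # hypothetical data: 7 heads, 3 tails

agent_1 = beta_update(shared_prior, *data)
agent_2 = beta_update(shared_prior, *data)
print(posterior_mean(agent_1), posterior_mean(agent_2))   # identical: both 2/3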
billschnieder said:
If I thought for a second you were truly interested in understanding my view of probability rather than just quibbling, I would have engaged.
I am interested in debates on probability only as they relate directly to the issue of Bell's proof. Again, it seems rather obvious that Bell cannot have meant his probabilities to be understood in a Bayesian/Jaynesian sense, since we have no information about the value of λ and would thus naturally have to pick a prior where every value is assigned an equal probability, but his equations consistently feature terms like p(λ) which suggest some (unknown by us) probability distribution on values. If you want to continue to object to my interpreting Bell in a frequentist way (keep in mind that in doing so I am not saying there is anything 'wrong' with the Bayesian view in general, just that it can't be what Bell had in mind), I would ask that you please respond to this section of post #129, even if you don't want to respond to any of my other questions about how you are defining probability:
Finally, given that Bell includes p(λ) in the integral in his equation (2), implying that the different values of λ may have different probabilities, do you really think it makes sense to interpret his argument in terms of a maximum-entropy approach where all values for unknown variables are considered equally likely? Perhaps you are interpreting the argument in terms of something like a half-omniscient being who is able to learn the value of λ on each trial and updates the probabilities based on that? Again, I'm not arguing that non-frequentist approaches to probability are "wrong", just that the frequentist interpretation is a coherent one, and that it's the one that makes the most sense to use when discussing Bell's argument. Please consider again what I said in post #124:
Likewise, if you don't want to waste a lot of time on the philosophy of probability, you have the option to just say something like "I personally don't like the frequentist interpretation but I understand it's a very traditional and standard way of thinking about probabilities, and since I want to confront your (and Bells') argument on its own terms, if you think the frequentist interpretation is the best way to think about the probabilities that appear in Bell's proof, I'll agree to adopt this interpretation for the sake of the argument rather than get into a lot of philosophical wrangling about the meaning of probability itself".
Are you completely unwilling to adopt the frequentist definitions even for the sake of argument? Do you disagree that Bell himself was most likely thinking in terms of some more "objective" notion of probabilities than Bayesianism when he wrote his proof?
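(For reference, the equation being discussed is Bell's 1964 equation (2) for the expectation value of the product of the two measurement outcomes,

P(a,b) = ∫ dλ ρ(λ) A(a,λ) B(b,λ),

where ρ(λ), written p(λ) above, is the probability distribution over the hidden variables; the disagreement here is over how that distribution should be interpreted.)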
billschnieder said:
However, if you are truly interested in understanding a more generally applicable view of Probability theory than your limited view, I dare you to pick a single Jaynes article on probability and read it.
Like I said many times, I am not interested in a general debate about which is "better", frequentism or Bayesianism. I have never made any general claims that the frequentist view is better or that there's anything wrong with the Bayesian view. I've just claimed that the frequentist view is internally coherent, and that it seems like the best way to understand what Bell meant when he wrote down terms like p(λ). If you disagree with this last modest claim, please respond to my questions from post #129 quoted above.
billschnieder said:
In case you want to quibble about this statement, remember that you could not even give me a probability for the damaged coin situation.
For someone who believes that probabilities have an objective value, it's no surprise that there may be situations where we can't form a good estimate of the correct value! I already explained this with the analogy of the mass of an asteroid that we can just barely resolve with our telescopes; presumably you don't think it's a problem for the conventional definition of mass that you could not even give a number for the asteroid's mass in this situation.
billschnieder said:
So I suggest you read the following article by Jaynes (Jaynes, E. T., 1990, "Probability Theory as Logic," in Maximum-Entropy and Bayesian Methods, P. F. Fougère (ed.), Kluwer, Dordrecht, p. 1, http://bayes.wustl.edu/etj/articles/prob.as.logic.pdf). BTW, be prepared for an epiphany. But my gut tells me you won't read it; you will assume that you already know his views just because you read some third-party statement about him.
You seem to want to turn this discussion into some big cage match which is supposed to determine once and for all which definition of probability is "better" in general, as suggested by your "be prepared for an epiphany", as if by reading it I am supposed to experience a kind of religious conversion. But I'm not a "believer" in any particular definition, there are many different views that are internally coherent and well-defined and may have varying degrees of usefulness in different situations. All that is relevant is that when analyzing Bell's theorem, I think the frequentist definitions are best for understanding what he meant, the Bayesian definitions would seem to make the proof fairly incoherent for the reasons I discussed already.

Finally, I wonder about your vague criticism "you will assume that you already know his views just because you read some third-party statement about him". I certainly never claimed to understand all his views, I just assumed the general statements in the Stanford article were correct as those articles are generally very well-researched. Once and for all, are you actually claiming there was any actual error in my summary or the Stanford article's? For example, if you claim it's an error that he was a type of Bayesian with particular ideas about how to define the prior probabilities, the paper you link me to doesn't seem to support your view, just from skimming it he pretty clearly aligns with Bayesians in statements like "The real job before us is to make the best estimates possible from the information we have in each individual case; and since Bayesians already have the solution to that problem, we have no need to discuss a lesser problem." (from p. 7) I'm not going to read the whole paper in detail unless you claim it's directly relevant to the question of whether it makes sense to interpret the probabilities that appear in Bell's proof in non-frequentist terms (in which case I'd expect you to answer my questions about p(λ) from post #129, quoted in this post), or to showing that some specific statement from the Stanford article about Jaynes' views is incorrect (in which case I'd expect you to identify the specific statement). If you can convince me of its relevance to these specific issues by addressing my questions in specific ways rather than just throwing out a lot of broad aggressive statements like in your last post, in that case I'm happy to read it, but like you I don't have infinite time and the paper is rather long.
 
  • #134


You asked me to address your scratch lotto argument in more detail, so I'm doing so here:
billschnieder said:
I have modified it to make the symbols more explicit and the issue more clear as follows:

Suppose we have a machine that generates pairs of scratch lotto cards, each of which has three boxes (1,2,3) that, when scratched, can reveal either a cherry or a lemon (C, L). We give one card to Alice and one to Bob, and each scratches only one of the three boxes. Let us denote the outcomes (ij) such that (CL) means Alice got a cherry and Bob got a lemon. There are therefore only 4 possible pairs of outcomes: CC, CL, LC, LL. Let us denote the pair of choices by Alice and Bob as (ab), for example (11) means they both selected box 1 on their cards, and (31) means Alice selected box 3 and Bob selected box 1. There are therefore 9 possible choice combinations: 11, 12, 13, 21, 22, 23, 31, 32 and 33.

When we repeat this many times, we find that
(a) whenever they both pick the same box to scratch, they always get the same result. That is whenever the choices are, 11, 22 or 33, the results are always CC or LL.
(b) whenever they both pick different boxes to scratch, they get the same results only with a relative frequency of 1/4.

How might we explain this?
We might suppose that there is definitely either a cherry or a lemon in each box, even though we don't reveal it until we scratch it. In which case, there are only 8 possible cards that the machine can produce: CCC, CCL, CLC, CLL, LCC, LCL, LLC, LLL. To explain outcome (a) then, we might say that the "hidden" fruit in a given box of one card always matches the hidden fruit in the same box of the other card. Therefore the machine must always send the same type of card to Bob and Alice. However, doing this introduces a conflict for outcome (b) as follows:

Consider the case where the cards sent to Bob and Alice were of the LLC type. Since outcome (b) involves Alice and Bob scratching different boxes, there are six possible ways they could scratch.

12LL (i.e., Alice scratches box 1, Bob scratches box 2, Alice gets Lemon, Bob gets Lemon)
21LL
13LC
31CL
23LC
32CL (i.e., Alice scratches box 3, Bob scratches box 2, Alice gets Cherry, Bob gets Lemon)

Out of the 6 possible outcomes, only 2 (the first two) correspond to the same outcome for both Alice and Bob. Therefore the relative frequency will be 2/6 = 1/3 not 1/4 as observed. This is the case for all the types of cards produced. This is analogous to the violation of Bell's inequalities.

According to JesseM, it is impossible to explain both outcome (a) and outcome (b) with an instruction set, as the above illustration shows.

JesseM,
Does this faithfully reflect the example you want me to address? If not point out any errors and I will amend as necessary.
That description is fine, though one thing I would add is that in order to derive the inequality that says they should get the same fruit 1/3 or more of the time, we are assuming each chooses randomly which box to scratch, so in the set of all trials the probability of any particular combination like 12 or 22 is 1/9, and in the subset of trials where they picked different boxes the probability of any combination is 1/6. And of course I do not disagree with any of the standard known loopholes in these derivations, like the detection efficiency loophole or the no-conspiracy loophole.
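A minimal Python sketch of the counting argument above, assuming identical cards are sent to both players and each player picks a box uniformly at random:
Code:
from itertools import product

boxes = [0, 1, 2]
# the 6 ways Alice and Bob can pick different boxes, each equally likely
diff_pairs = [(a, b) for a in boxes for b in boxes if a != b]

for card in product('CL', repeat=3):   # the 8 possible instruction sets, CCC ... LLL
    # identical cards are sent to both, so same-box choices agree automatically
    agree = sum(1 for a, b in diff_pairs if card[a] == card[b])
    print(''.join(card), 'different-box agreement: %d/6' % agree)
Every instruction set gives a different-box agreement rate of either 2/6 = 1/3 (the six mixed sets) or 6/6 (CCC and LLL), so no mixture of instruction sets can push the rate below 1/3, which is the conflict with the observed 1/4.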
billschnieder said:
(continuing from my last post)
So far, the conundrum is that the only instruction sets which explain outcome (a) produce a relative frequency (1/3) for outcome (b) that is significantly higher than the one predicted by QM and observed in experiments (1/4).

There is however one interesting observation not included in the above treatment. In all experiments performed so far, most of the particles sent to the detector are undetected. In the situation above, this is equivalent to saying that not all the cards sent to Alice or Bob reveal a fruit when scratched.

The alternative explanation:
A more complete example then must include "no-fruit" (N) as a possible outcome. So in addition to the four outcomes listed initially (CC, CL, LC, LL), we must add the four cases in which only one fruit is revealed for a given pair (CN, NC, LN, NL) and the one case in which no fruit is revealed at all (NN). Interestingly, in real experiments, whenever only one member of the pair is detected, the whole pair is discarded. This is the purpose of the coincidence circuitry used in Aspect-type experiments.

One might explain it by supposing that a "no-fruit" (N) result is obtained whenever Alice or Bob makes an error by scratching the chosen box too hard so that they also scratch off the hidden fruit underneath it. In other words, their scratching is not 100% efficient. However, no matter how low their efficiency, if this mistake happens randomly enough, the sample which reveals a fruit will still be representative of the population sent from the card machine, and by considering just those cases in which no mistake was made during scratching (cf. using coincidence circuitry), the conundrum remains. Therefore in this case, the efficiency of the detector does not matter.

There is yet another possibility. What if the "no-fruit" (N) result is an instruction carried by the card itself rather than a result of inefficient scratching? That is, instead of always having either a cherry or a lemon in each box, we allow for the possibility that some boxes are just left empty (N) and will therefore never produce a fruit no matter how efficiently they scratch.

Keeping this in mind, let us now reconsider the LLC case we discussed above, except that the machine now has the freedom to generate the pair such that, in one card of each pair it generates, one of the boxes is empty (N). For example, the card LNC is sent to Alice while the card LLC is sent to Bob. Note that now the machine is no longer sending exactly the same card to both Alice and Bob. The question then is: can this new instruction set explain both outcomes (a) and (b)? Let us verify:

(a) When both Alice and Bob select the same box to scratch, the possible outcomes for the (LNC,LLC) pair of cards sent are 11LL, 33CC, 22NL. However, since the 22NL case results in only a single fruit, it is rejected as an error case. Therefore in every case in which they both scratch the same box and they both reveal a fruit, they always reveal the same fruit. Outcome (a) is therefore explained.

(b) What about outcome (b)? All the possible results when they select different boxes from the (LNC,LLC) pair are 12LL, 21NL, 13LC, 31CL, 23NC, 32CL. As you can see, in 2 of the 6 possible cases only a single fruit is revealed. Therefore we reject those two and have only 4 possible outcomes for which they scratch different boxes and both of them observe a fruit (12LL, 13LC, 31CL, 32CL). However, in only one of these do they get the same fruit (12LL). Therefore in one out of the four possible outcomes in which they both scratch different boxes and both get a fruit, they get the same fruit, corresponding to a relative frequency of 1/4, just as predicted by QM and observed in real experiments.

The same applies to the other possible instruction sets when the machine has the freedom to leave one box of the pair sent out empty. The conundrum is therefore resolved.
Yes, this is a valid possibility, and it illustrates the detection efficiency loophole. But note that if we assume every pair sent out by the source has an "N" for one of the six hidden variables, that implies that if we looked at the subset of cases where they chose different boxes to scratch (akin to different detector settings), it should be impossible for them to ever get a detector efficiency higher than 2/3--a falsifiable prediction of this model! Of course if you design an improved experiment with a detector efficiency higher than 2/3, you could always explain it by imagining the source is sending out some mix of card pairs with Ns and card pairs with no Ns for any of the six hidden variables. But that would in turn imply the frequency of identical fruits with different settings should be a bit higher than 1/4 even if it could still be lower than the 1/3 predicted by the inequality (which was derived based on the assumption of perfect detection). So there should be some relation between detector efficiency and the maximum possible violation of a given Bell inequality under this type of model, which would allow it to be falsified by experiment even if the detector efficiency isn't 100%. The section of the wikipedia article discussing the efficiency loophole discusses altered inequalities that should hold (assuming local realism and various other conditions used in derivations of Bell inequalities) in the case of imperfect detector efficiency, and says:
With only one exception, all Bell test experiments to date are affected by this problem, and a typical optical experiment has around 5-30% efficiency. The bounds are actively pursued at the moment (2006). The exception to the rule, the Rowe et al. (2001) experiment is performed using two ions rather than photons, and had 100% efficiency. Unfortunately, it was vulnerable to the locality loophole.
Closing all the loopholes simultaneously may be possible in the near future, as suggested by this paper and this one. Still, I have no problem admitting that no existing experiment has closed both the detection efficiency and locality loopholes at the same time, and that both need to be closed for a perfect test (though I suspect it would be difficult or impossible to come up with a local realist theory that could match existing test results in both experiments that closed the efficiency loophole and experiments that closed the locality loophole, and yet wasn't extremely contrived-looking). The need to close these loopholes is agreed on by all the mainstream physicists who agree with other aspects of Bell's argument, which is why I said I'd rather focus on the issues where you depart from the mainstream in seeing "problems" that most physicists do not.
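To make the numbers in this exchange concrete, here is a minimal Python sketch of the (LNC, LLC) pair discussed above, assuming each player picks a box uniformly at random and that a trial is kept only when both boxes reveal a fruit:
Code:
alice = ['L', 'N', 'C']   # card sent to Alice (one box left empty)
bob   = ['L', 'L', 'C']   # card sent to Bob
boxes = [0, 1, 2]

same_box = [(a, a) for a in boxes]
diff_box = [(a, b) for a in boxes for b in boxes if a != b]

def stats(choices):
    # keep only "coincidences", i.e. trials where both boxes reveal a fruit
    detected = [(alice[a], bob[b]) for a, b in choices
                if alice[a] != 'N' and bob[b] != 'N']
    agree = sum(1 for x, y in detected if x == y)
    return len(choices), len(detected), agree

for label, choices in (('same box', same_box), ('different box', diff_box)):
    n, d, agree = stats(choices)
    print('%s: %d of %d trials detected, %d of %d detected trials agree'
          % (label, d, n, agree, d))
For this pair the detected same-box trials always agree, the detected different-box trials agree 1 time in 4, and the different-box coincidence rate is 4/6 = 2/3, which is the ceiling referred to in the reply above.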
 
  • #135
JesseM said:
Before that you had said nothing of the sort, you just said you saw probability as "rational degree of belief", and you had also said My definition above covers both the "frequentists" and "bayesian" views as special cases whereas I had understood Jaynes to just be a type of Bayesian.

From Jaynes' book which I pointed you to earlier, Probability Theory: The Logic of Science, Preface, page xxii
Jaynes said:
However, neither the Bayesian nor the frequentist approach is universally applicable, so in the present more general work we take a broader view of things. Our theme is simply: Probability Theory as Extended Logic. The "new" perception amounts to the recognition that the mathematical rules of probability theory are not merely rules for calculating frequencies of "random variables"; they are also the unique consistent rules for conducting inference (i.e. plausible reasoning) of any kind, and we shall apply them in full generality to that end.

It is true that all "Bayesian" calculations are included automatically as particular cases of our rules; but so are all "frequentist" calculations. Nevertheless, our basic rules are broader than either of these, and in many applications our calculations do not fit into either category.

To explain the situation as we see it presently: The traditional "frequentist" methods which use only sampling distributions are usable and useful in many particularly simple, idealized problems; but they represent the most proscribed special cases of probability theory, because they presuppose conditions (independent repetitions of a "random experiment" but no relevant prior information) that are hardly ever met in real problems. This approach is quite inadequate for the current needs of science.
 
  • #136
JesseM said:
...And of course I do not disagree with any of the standard known loopholes in these derivations, like the detection efficiency loophole or the no-conspiracy loophole.
Did you see the part where I explained that in my treatment, Alice and Bob were 100% efficient at scratching their boxes? That is equivalent to 100% detector efficiency. Remember I said:
One might explain it by supposing that a "no-fruit" (N) result is obtained whenever Alice or Bob makes an error by scratching the chosen box too hard so that they also scratch off the hidden fruit underneath it. In other words, their scratching is not 100% efficient. However, no matter how low their efficiency, if this mistake happens randomly enough, the sample which reveals a fruit will still be representative of the population sent from the card machine, and by considering just those cases in which no mistake was made during scratching (cf. using coincidence circuitry), the conundrum remains. Therefore in this case, the efficiency of the detector does not matter.

There is yet another possibility. What if the "no-fruit" (N) result is an instruction carried by the card itself rather than a result of inefficient scratching? That is, instead of always having either a cherry or a lemon in each box, we allow for the possibility that some boxes are just left empty (N) and will therefore never produce a fruit no matter how efficiently they scratch.

So the detector efficiency does not come in here, unless by it you mean something other than the efficiency of the detector. Conspiracy doesn't come in either. If you disagree, point out where you see a conspiracy in my treatment.

Yes, this is a valid possibility, and it illustrates the detection efficiency loophole. But note that if we assume every pair sent out by the source has an "N" for one of the six hidden variables, that implies that if we looked at the subset of cases where they chose different boxes to scratch (akin to different detector settings), it should be impossible for them to ever get a detector efficiency higher than 2/3--a falsifiable prediction of this model!
The model closely matches the experiment, and they both agree, so it cannot be a loophole, at least not in the experiment or model. The fraction of emitted photons that are detected is actually 5/6 (6 possible non-detections in 32 possible instructions), i.e. ~83% of the photons emitted. Interestingly, from the wikipedia link you provided, you need to detect at least ~82% of the photons emitted in order to conclude that an experiment has violated Bell's inequality without additional assumptions.
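(For reference, under this one-empty-box model with uniformly random box choices, the photon on the card carrying the N is lost only when its owner happens to scratch that box, i.e. with probability 1/3, while the partner photon is always revealed; so the expected fraction of emitted photons that are detected works out to (1 + 2/3)/2 = 5/6 ≈ 83%.)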

Closing all the loopholes simultaneously may be possible in the near future, as suggested by this paper and this one.
Do you agree that rather than wait for the perfect experiment that matches all the assumptions in Bell's inequalities to be performed, it should be much easier and doable to just develop better inequalities that more closely match realizable scenarios? Why hasn't that been done yet in all the years of discussion about loopholes? Don't answer, let me tell you: because whenever this is done, the resulting inequalities are not violated.
 
  • #137
billschnieder said:
Did you see the part where I explained that in my treatment, Alice and Bob were 100% efficient at scratching their boxes? That is equivalent to 100% detector efficiency.
When physicists talk about the detector efficiency loophole, they're just talking about the fact that both members of all pairs aren't detected, they're not talking about the ultimate cause of non-detection and whether it has to do with hidden variables or is due to flaws in the detectors. After all, they routinely give precise numbers for the "efficiency" of various detectors, but by definition if hidden variables (local or nonlocal) are causing some of the nondetections you won't know that this was the cause!
billschnieder said:
So the detector efficiency does not come in here, unless by it you mean something other than the efficiency of the detector.
I mean "the efficiency of the detector" in the sense that physicists talking about Aspect type experiments would normally use that phrase, it's just about observable questions of how many photon pairs are detected, not a claim about the true cause of cases where both members of the pair weren't detected.
billschnieder said:
The model closely matches the experiment, and they both agree, so it cannot be a loophole, at least not in the experiment or model.
When people talk about "loopholes" in the context of Bell's theorem, they're talking about ways that Bell inequalities can be violated in a local realist universe if various experimental conditions assumed explicitly or implicitly in the derivation (like the condition that both choice of settings and measurements must have a spacelike separation, or the condition that there must be perfect detection of all pairs emitted by the source). In other words, the "loophole" people are talking about is in some oversimplified summary of Bell's theorem like "in a local realist universe you can never have measurements on particle pairs which violate Bell inequalities", it's not meant to be a loophole in the experiment or model.
billschnieder said:
The fraction of emitted photons that are detected is actually 5/6 (6 possible non-detections in 32 possible instructions), i.e. ~83% of the photons emitted. Interestingly, from the wikipedia link you provided, you need to detect at least ~82% of the photons emitted in order to conclude that an experiment has violated Bell's inequality without additional assumptions.
Probably just a coincidence, as the wikipedia article also specifies that the basis of ~82% is that it's 2*(sqrt(2) - 1).
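(For reference, 2*(sqrt(2) - 1) ≈ 0.828, i.e. roughly 82.8%.)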
billschnieder said:
Do you agree that rather than wait for the perfect experiment that matches all the assumptions in Bell's inequalities to be performed, it should be much easier and doable to just develop better inequalities that more closely match realizable scenarios? Why hasn't that been done yet in all the years of discussion about loopholes?
If you looked at the section of the wikipedia article I linked to, did you not notice the big equation that gives an analogue of the CHSH inequality for imperfect detection? Or are you not counting that as an example of inequalities that "more closely match realizable scenarios" for some reason?
billschnieder said:
Don't answer, let me tell you: because whenever this is done, the resulting inequalities are not violated.
There have been experiments where the detection efficiency loophole was closed and some Bell inequality was still violated (see here for instance), it's just that the experiments in question didn't adequately ensure there was a spacelike separation between measurements, so there's the loophole that in principle one particle's detection could have sent a hidden "message" to the other particle telling it how to behave. Again, I think it would be an extremely contrived local realist theory that gave correct predictions about both these experiments and also the experiments where the locality loophole was closed (so there was a spacelike separation between measurements) but the detection efficiency loophole wasn't.
 
  • #138
billschnieder said:
From Jaynes' book which I pointed you to earlier, Probability Theory: The Logic of Science, Preface, page xxii
OK, most of the preface (and particularly pages xxii - xxiii) can be viewed on google books here. First of all, when he talks about "frequentism" I'm pretty sure he isn't talking about ideal frequentist definitions of the "true probability" in a given experiment being the frequency in the limit as the number of trials goes to infinity; rather, he is talking about practical frequentist methods for estimating probabilities in real-world situations. He also clearly proclaims the superiority of Bayesianism over whatever he means by "frequentism", even if he feels the maximum-entropy method goes beyond both. For example on page xxii he says:
For many years, there has been controversy over 'frequentist' versus 'Bayesian' methods of inference, in which the writer has been an outspoken partisan on the Bayesian side.
And then he goes on to say:
In these old works there was a strong tendency, on both sides, to argue on the level of philosophy or ideology. We can now hold ourselves somewhat aloof from this, because, thanks to recent work, there is no longer any need to appeal to such arguments. We are now in possession of proven theorems and masses of worked-out numerical examples. As a result, the superiority of Bayesian methods is now a thoroughly demonstrated fact in a hundred different areas. One can argue with a philosophy; it is not so easy to argue with a computer printout, which says to us: 'Independently of all your philosophy, here are the facts of actual performance.' We point out this detail whenever there is a substantial difference in the final results. Thus we continue to argue vigorously for the Bayesian methods; but we ask the reader to note that our arguments now proceed by citing facts rather than proclaiming a philosophical or ideological position.
Presumably the "worked-out numerical examples" and "computer printouts" concern practical examples where one is trying to come up with numbers for probabilities based on some finite collection of data, which again pretty clearly suggests he is talking about practical frequentist methods rather than these results disprove the ideal notion that frequencies would converge on some particular value if an experiment were repeated an infinite number of times, and these frequencies-in-the-limit can be defined as the "true" probabilities in the experiment which our practical methods can only imperfectly estimate.

I think the "frequentist methods" here would just be that of looking at the frequency of some outcome in an actual large set of trials of an identically-repeated experiment (which by the law of large numbers should be likely to not be too far off from the frequencies that would be seen if we could hypothetically repeat the experiment an infinite number of times. When he says that frequentist calculations are "particular cases of our rules", I think he just means that this method is fine in certain situations (precisely those where you are repeating an experiment with exactly the same known conditions and no other relevant prior information differing from one trial to another), just that the situations are very limited and that other methods apply to a wider range of situations. See this paragraph:
To explain the situation as we see it presently: The traditional 'frequentist' methods which use only sampling distributions are usable and useful in many particularly simple, idealized problems; however, they represent the most proscribed cases of probability theory, because they presuppose conditions (independent repetitions of a 'random experiment' but no relevant prior information) that are hardly ever met in real problems.
I think Aspect-type experiments do meet these conditions, in that the "same experiment" is being repeated and in each trial there is "no relevant prior information" that is different from one trial to another and which allows us to anticipate what result we are likely to see. So he should have no problem with applying frequentism to these experiments even as a practical method of estimating probabilities, let alone object to the philosophical definition of the "true" probabilities in terms of frequencies in the limit as number of trials goes to infinity (or to the idea that in these experiments the law of large numbers says the estimated probabilities will be increasingly likely to be close to the 'true' probabilities the more trials are performed). Do you disagree with any of this paragraph?

Lastly, on the way in which his method is non-Bayesian, he seems to be saying that in order to use "Bayesian" methods you need a certain level of information about the problem which is not always available. If this information is available then his methods are identical to those of any Bayesian, his ideas are non-Bayesian only when the information is not available:
Before Bayesian methods can be used, a problem must be developed beyond the 'exploratory phase' to the point where it has enough structure to determine all the needed apparatus (a model, sample space, hypothesis space, prior probabilities, sampling distribution). Almost all scientific problems pass through an initial exploratory phase in which we have need for inference, but the frequentist assumptions are invalid and the Bayesian apparatus is not yet available. Indeed, some of them never evolve out of the exploratory phase. Problems at this level call for more primitive means of assigning probabilities directly out of our incomplete information.
So this makes me think the Stanford article was oversimplifying to call him an "Objective Bayesian" without further explanation, but then again it's just an oversimplification rather than being outright false, because he would use ordinary Bayesian methods in any problem where a Bayesian would have enough information to calculate any probabilities.

Finally, regardless of how best to characterize Jaynes' views and what he would think about a frequentist approach to Aspect-type experiments, can you please answer my question from post #129 about whether it makes sense to interpret the probabilities that appear in Bell's own argument as subjective estimates by some observer rather than objective frequencies-in-the-limit? If you do think it makes sense to interpret them in a subjective way, can you explain whether the observer would be a human one who never knows the values of the hidden variables on a given trial and therefore is forced to have p(λ) be a uniform probability distribution, or whether you are imagining some hypothetical observer who does learn the value and can thus update his subjective probability distribution the more trials there are (if so how many hypothetical trials are seen by the hypothetical observer), or some other type of hypothetical observer?
 
  • #139
billschnieder said:
Do you agree that rather than wait for the perfect experiment that matches all the assumptions in Bell's inequalities to be performed, it should be much easier and doable to just develop better inequalities that more closely match realizable scenarios? Why hasn't that been done yet in all the years of discussion about loopholes? Don't answer, let me tell you: because whenever this is done, the resulting inequalities are not violated.

This is a very strange thing to say. First, no one is waiting for anything. There is nothing to wait for! Second, are you saying that someone has done an experiment that did NOT violate an expected Bell Inequality and then did not publish it? That is an outrageous suggestion, assuming you are in fact suggesting that - and I certainly hope you aren't.
 
  • #140
DrChinese said:
This is a very strange thing to say. First, no one is waiting for anything. There is nothing to wait for! Second, are you saying that someone has done an experiment that did NOT violate an expected Bell Inequality and then did not publish it? That is an outrageous suggestion, assuming you are in fact suggesting that - and I certainly hope you aren't.

What are you talking about? Please try to understand what I said before you call it strange.

Clearly you do not deny the fact that it is easier to modify a theory to account for real situations than it is to reproduce a real experiment which fulfills all the assumptions implicit in the derivation of the theory. This is common sense.

Clearly you do not deny that there has not been a perfect Bell test experiment ever! Note, a perfect Bell test experiment is an experiment which realizes all the assumptions implicit in the derivation of Bell's inequalities. So if you as an individual are not waiting for one to be performed soon, then I doubt you have an active interest in the field as you claim.

Finally, it is common sense to realize that, rather than wait for the perfect experiment which closes a loophole, say the "detection efficiency loophole", it should be easier to derive new inequalities which take into account what is really observed, i.e. the fact that not all photons emitted will be detected. Obviously, if an experiment violated these new inequalities, there would be no talk of any possible "detection efficiency loophole".

But conventional wisdom has been upside down. While deriving the inequalities, Bell assumed 100% efficiency, so the burden is now placed on experimentalists to perform an experiment with similarly extreme efficiency. Why do you think it is unreasonable to derive new inequalities which take into account the fact that experiments are not 100% efficient?
 