I Bayesian statistics in science

Sunil · Nov 3, 2021

[Moderator's note: This thread has been split off from a previous thread since its topic is best addressed in a separate discussion. This post has been edited to focus on the topic for separate discussion.]

Jaynes has used in the derivation of the rules of probability as the logic of plausible reasoning in his "Probability Theory: The Logic of Science" the following trick: Instead of defining the rules for your own thinking, he has introduced a robot, some AI, and we have the job for defining the rules of his thinking. The trick is that if we think about a rules for a robot, we will care much more about the consistency of these rules. And the basic assumptions there are consistency rules: If there are several ways to derive something, the result should be the same. For our own reasoning, consistency is (intuitively) secondary.

The same type of reasoning we should apply here too. What should be the rules of physical reasoning for a robot designed to help physicists?

Fra · Nov 3, 2021

Sunil said:

If there are several ways to derive something, the result should be the same. For our own reasoning, consistency is (intuitively) secondary.

The same type of reasoning we should apply here too. What should be the rules of physical reasoning for a robot designed to help physicists?

Or an agent? Then ask what is the problem and consequence of agents making incompatible inferences? Then we are soon friends

/Fredrik

gentzen · Nov 3, 2021

Sunil said:

Jaynes has used ... the following trick: ... The trick is that if we think about a rules for a robot, we will care much more about the consistency of these rules. And the basic assumptions there are consistency rules: If there are several ways to derive something, the result should be the same. For our own reasoning, consistency is (intuitively) secondary.

The same type of reasoning we should apply here too. What should be the rules of physical reasoning for a robot designed to help physicists?

Are you aware of the content of the paper "Quantum mechanics via quantum tomography" that this thread is about? For example, it says in section "5.5 Objectivity":

The assignment of states to stationary sources is as objective as any assignment of properties to macroscopic objects. Thus the knowledge people talk about when referring to the meaning of a quantum state resides in what is encoded in (and hence ”known to”) the model used to describe a quantum system – not to any subjective mind content of a knower!

In particular, as quantum values of members of a quantum measure, all probabilities are objective frequentist probabilities in the sense employed everywhere in experimental physics – classical and quantum. That the probabilities are only approximately given by the relative frequencies simply says that – like all measurements – probability measurements are of limited accuracy only.

No robot, no agent, no "subjective mind content of a knower". The meaning of a quantum state resides in what is encoded in (and hence ”known to”) the model! This is almost exactly the opposite of what Jaynes would tell you.

PeterDonis · Nov 3, 2021

gentzen said:

The meaning of a quantum state resides in what is encoded in (and hence ”known to”) the model!

But isn't the model mind content?

gentzen said:

This is almost exactly the opposite of what Jaynes would tell you.

I think the passage you quote, or at least the interpretation you are giving it, is trading on an ambiguity in the word "subjective".

In fact, what it is describing is the same kind of thing as what Jaynes describes: the "robot" Jaynes describes builds a model of some system, and uses the model to compute probabilities. Those computations are perfectly objective: they are mathematical operations starting from precisely defined initial propositions, and the same operations applied to the same propositions will give the same answers every time.

The only "subjectivity" involved in Jaynes is that different robots in different states of knowledge--meaning, with different sets of data available to them--will have different models, and will therefore make different computations of probabilities because they are starting from different initial propositions. But that is equally true of experimenters doing quantum tomography: their model is built from the information they have obtained from their experiments, and two experimenters who have run different sets of experiments will have different models, and will therefore compute different probabilities. That is every bit as "subjective" as what Jaynes describes. But of course it's not "subjective" at all in the sense of people just arbitrarily choosing probabilities instead of computing them using specified operations from specified initial propositions--and neither is Jaynes.

A. Neumaier · Nov 3, 2021

PeterDonis said:

But isn't the model mind content?

If you call this 'mind content' then all physics and all language is mind content, and the phrase 'mind content' becomes meaningless since it comprises everything.

PeterDonis said:

I think the passage you quote, or at least the interpretation you are giving it, is trading on an ambiguity in the word "subjective".

gentzen's interpretation is exactly what I intended to convey.

PeterDonis said:

In fact, what it is describing is the same kind of thing as what Jaynes describes: the "robot" Jaynes describes builds a model of some system, and uses the model to compute probabilities.

This is not the standard meaning of 'model'. A model is a template in which the parameters are not fixed but to be determined by experiment. In the case of quantum tomography, the model is the Hilbert space chosen to model the quantum system - nothing else. The state is a matrix of parameters that are not determined by the model but by experiments, using the traditional objective, universally agreed statistical techniques. (Unless you think that the publications of the particle data group are not objective but subjective mind content. Then we do not need to discuss further.)

For example, a classical quartic oscillator is a model defined by a Hamiltonian $$H=p^2/2m+kq^2/2 +gq^2/4;$$ its coefficients (mass ##m## and coupling constant ##g##) are the parameters. The claim that a particular oscillator is well described by this within a given accuracy can be decided objectively by making experiments on the oscillator and comparing it with the predictions of the model.

If the model is correct to some accuracy, there will be a parameter setting that matches the prediction, and in the limit of arbitrarily many and arbitrarily accurate measurements, the parameters will be determined uniquely by the experiment. The only subjectivity is the choice of the ansatz for the Hamiltonian. This is the kind of subjectivity you have everywhere in physics. It has nothing to do with probabilities.

Whereas what you declare to be Jaynes' model is the parameters. The correctness of a Jaynes' model can be refuted by experiment unless the model is actually correct within the given accuracy. This can be established with a 5 sigma confidence if enough data are collected. In physics, this counts as objective.

PeterDonis said:

The only "subjectivity" involved in Jaynes is that different robots in different states of knowledge--meaning, with different sets of data available to them--will have different models, and will therefore make different computations of probabilities because they are starting from different initial propositions.

This makes Jaynes approach subjective in a way quantum tomography is not.

In quantum tomography, the state can be determined objectively independent of initial assumptions, by measuring long enough. There is no subjectivity in the parameters, if your parameters do not agree with the true parameters you'll sooner or later get a statistically arbitrarily significant discrepancy with experiments. Again, the only subjectivity is the choice of the model - in this case Hilbert space representing the quantum system. This is the kind of subjectivity you have everywhere in physics. It has nothing to do with probabilities.

PeterDonis · Nov 3, 2021

A. Neumaier said:

This is not the standard meaning of 'model'.

Sure it is. You're just using a different term for it than Jaynes normally uses. See below.

A. Neumaier said:

This makes Jaynes approach subjective in a way quantum tomography is not.

No, it means that in your approach, you have fixed the prior (in Bayesian terms):

A. Neumaier said:

In the case of quantum tomography, the model is the Hilbert space chosen to model the quantum system - nothing else.

Jaynes would agree with you that, having fixed this prior, any given set of experimental data will objectively lead to a unique computation of probabilities. The only possible difference between different people in this case is that they have different posterior data, in which case they might compute different posterior probabilities. But that is not a difference in models (for your definition of "model"); it's a difference in data.

In other words, what you mean by "model" is what Jaynes means by "prior".

A. Neumaier said:

Whereas what you declare to be Jaynes' model is the parameters.

I might not have been clear in my previous post because of the difference in your terminology vs. Jaynes'. Hopefully the above helps to clarify. I don't see any fundamental difference in your approach vs. Jaynes' approach, given that you have fixed the prior.

A question Jaynes might ask is why you have chosen that particular prior; the choice of prior is where the subjectivity enters in in Jaynes' view, but even on that view, one should still have some reasonable ground for choosing a particular prior. Given the role that Hilbert spaces are already understood to play in QM that question should not be hard for you to answer. (Although you, as the author of the thermal interpretation, might also want to explain why you chose the Hilbert space instead of the set of expectation values.)

A. Neumaier · Nov 3, 2021

PeterDonis said:

Jaynes would agree with you that, having fixed this prior, any given set of experimental data will objectively lead to a unique computation of probabilities.

No - neither he nor I would claim that. A given set of experimental data will never lead to a unique computation of probabilities. Different statistical techniques will give different results:

A simple frequentist estimator would be the relative frequency, which is not the probability but a deterministic (uniquely determined) estimate for it.
Jaynes would have to assume in addition to the data a prior for the probabilities (for example a Dirichlet prior) and then compute from data and prior combined a unique posterior for the probabilities. Because it depends on the prior the result is subjective.

But in quantum tomography the goal is not to obtain probabilities but to obtain the parameters of the model, in this case the density matrix. For this lots of different statistical procedures exist, all variants of the basic technique that I discuss in my paper. They produce different results - some more accurate than others, and increasing accuracy given the same data and a limited computational budget counts as scientific progress. The established scientific practice is that the computational technique used is specified together with the results, so that the procedure is objective, i.e., independent of undisclosed knowledge.

Jaynes would have to assume in addition to the data a prior specifying the probability for obtaining a givn density matrix, and then update this prior in the light of the data. This is a very inefficient way to proceed, especially when the Hilbert space is not only of toy dimensions. For a 10 qubit system, the Hilbert space has dimension 1024, the density matrix depends on more than a million variables, and the posterior would be an extremely complicated probability distribution in dimension of more than a million. Huge overkill!

PeterDonis said:

The only possible difference between different people in this case is that they have different posterior data, in which case they might compute different posterior probabilities. But that is not a difference in models (for your definition of "model"); it's a difference in data.

In other words, what you mean by "model" is what Jaynes means by "prior".

?
Jaynes's model is a probability distribution on states, initially the prior.

My model is the Hilbert space. How can it be considered to be a prior??

PeterDonis said:

given that you have fixed the prior.

? I don't have a prior. I have a model (a Hilbert space) and a huge, arbitrarily extensible collection of data. The latter determines (prior-independent) the parameters that are unspecified in the model to an accuracy determined by the data.

PeterDonis said:

A question Jaynes might ask is why you have chosen that particular prior;

Which prior did I choose? When and where?

PeterDonis said:

explain why you chose the Hilbert space instead of the set of expectation values.

The Hilbert space may be, for example, the tensor product of two two-dimensional Hilbert spaces. This Hilbert space enables one to discuss beams of two entangled photons in Bell experiments. This is the model. It models all possible beams of two entangled photons.

To find out which of this continuum of possibilities is actually realized you need to know in addition to the model its state. This knowledge is obtained by quantum tomography. It is objectively determined to some accuracy by sufficiently extensive data. There are many ways to extract this objective knowledge from the data.

Jaynes' Bayesian methods (which would describe the uncertainty remaining in terms of a probability distribution on density matrices) are not among the most used techniques to do this.

PeterDonis · Nov 3, 2021

A. Neumaier said:

in quantum tomography the goal is not to obtain probabilities but to obtain the parameters of the model, in this case the density matrix

In other words, you're computing a posterior estimate for those. Then just substitute "posterior estimate of model parameters" for "posterior estimate of probabilities" in what I posted before. Jaynes explicitly discusses the case of estimating model parameters.

Where Jaynes might differ from you is that, instead of just computing a point estimate for each model parameter, he would compute a probability distribution.

A. Neumaier said:

For this lots of different statistical procedures exist, all variants of the basic technique that I discuss in my paper.

Then how do you choose which one to use?

A. Neumaier said:

The established scientific practice is that the computational technique used is specified together with the results, so that the procedure is objective, i.e., independent of undisclosed knowledge.

Exactly. And Jaynes would agree. But in choosing which computational technique to use, either you've made a subjective choice, or you've made use of some other objective process for making the choice--in which case Jaynes would just include that objective process in his overall analysis. Jaynes would not introduce any additional subjectivity that's not already there in what you're doing.

A. Neumaier said:

Jaynes would have to assume in addition to the data a prior specifying the probability for obtaining a given density matrix

Why? Why would Jaynes have to assume anything that you're not? Either your assumptions plus the data you obtain are sufficient to compute a posterior estimate for what you want (the model parameters), or they're not. If they are, Jaynes would just use them; Jaynes never claims you should make some kind of additional assumption that's not required to compute what you want, just in order to satisfy some preconceived notion of what your process should be. If they're not, then you've left something out.

A. Neumaier said:

Jaynes's model is a probability distribution on states, initially the prior.

It doesn't have to be. It can just as easily be a probability distribution on model parameters. See above.

A. Neumaier said:

My model is the Hilbert space. How can it be considered to be a prior??

Because you've just assumed that Hilbert space is the right model. That's a subjective assumption on your part. Unless you want to justify it based on some kind of argument, in which case the initial assumptions of that argument will be your prior. Sooner or later what you're doing has to bottom out in some subjective choice of initial assumptions.

dextercioby · Nov 3, 2021

What if this subjectivity you (Peter D) invoke is nothing but a logical consequence of trying 100 theoretical models (e.g. Hamilton functions for quartic oscillator) until you find the one which matches experiment? Then you would probably transfer this subjectivity to the human mind who devised the rules of mathematical logic. There is no science done without the human mind, there is no science if you do not attempt to validate a theoretical model, but it should be the goal of science to devise models which can be entrusted, even if there will never be humans or aliens able to test it. Is black hole evaporation by quantum effects science? Will there ever be a human indisputably probing in a man-made and man-financed laboratory the mathematical/physical theory of black-hole evaporation?

A. Neumaier · Nov 3, 2021

PeterDonis said:

It doesn't have to be. It can just as easily be a probability distribution on model parameters.

This is identical. The state (density matrix) is the collection of model parameters.

I never consider probability distributions over states or model parameters. They are overkill. Point estimations (or more complex deterministic estimation procedures) are simpler and generally used.

PeterDonis said:

Because you've just assumed that Hilbert space is the right model. That's a subjective assumption on your part.

In quantum mechanics, assuming a Hilbert space is a must. Otherwise one cannot even begin making predictions. This has nothing to do with Jaynes' priors.

PeterDonis said:

Sooner or later what you're doing has to bottom out in some subjective choice of initial assumptions.

But this is not what Jaynes' theory is about. It is about how to update subjective probability distributions for model parameters when new information arrives.

In contrast, deterministic statistical estimation is concerned with parameter (state) estimation given a fixed collection of data.

PeterDonis said:

Then how do you choose which one to use?

I discuss the limit of arbitrarily much data, in which case the choice does not matter; all asymptotically consistent methods produce the true value. This is the reason one can speak of objectivity. It is the same criterion that is applied in classical physics.

The point of my paper is to show that amount of objectivity in quantum physics is no less than that in classical physics.

Your arguments just imply that classical physics is subjective, according to your standards, since any analysis must make assumptions. But this kind of subjectivity cannot be removed from science. It has nothing to do with the subjectiveness in Jaynes' theory, and is not what scientists mean when they talk of subjectivity of knowledge.

PeterDonis · Nov 3, 2021

A. Neumaier said:

this is not what Jaynes' theory is about.

I think you are mistaken. When I read your description of what you are doing, it looks the same to me as Jaynes's description of what to do in a similar situation. You are just using different terminology, and perhaps making different judgments about what amount of work is necessary (for example, your statement that point estimates of density matrix parameters are sufficient and probability distributions are overkill--though it's quite possible Jaynes would make the same judgment in a similar situation).

A. Neumaier said:

It is about how to update subjective probability distributions for model parameters when new information arrives.

I think your use of "subjective" here is gratuitous and misleading. Probability distributions are not subjective. The only subjectivity is in the initial choice of assumptions, and you state later on in your post (and I agree with you) that assumptions are unavoidable in any area of science. I don't see, as I have already said, that Jaynes would make any assumptions beyond those that you make, in the particular case you discuss. He would just describe the assumptions using different terms.

A. Neumaier said:

In contrast, deterministic statistical estimation is concerned with parameter (state) estimation given a fixed collection of data.

If this statement about "deterministic statistical estimation" is really true, it seems useless to me. What good is a model that can only explain a fixed collection of data and can't be updated when new data comes in?

A. Neumaier said:

Your arguments just imply that classical physics is subjective, according to your standards, since any analysis must make assumptions. But this kind of subjectivity cannot be removed from science and is not what scientists mean when they talk of subjectivity of knowledge.

Then what do scientists mean when they talk about subjectivity of knowledge, and why do you think Jaynes is guilty of it while you are not?

PeterDonis · Nov 3, 2021

A. Neumaier said:

In quantum mechanics, assuming a Hilbert space is a must. Otherwise one cannot even begin making predictions.

Why not?

PeterDonis · Nov 3, 2021

A. Neumaier said:

Jaynes' Bayesian methods (which would describe the uncertainty remaining in terms of a probability distribution on density matrices)

I think you are misunderstanding Jaynes's general method. His general method is not specifically Bayesian; Bayesian inference is a special case of his method (and frequentist inference is in turn, on his view, a special case of Bayesian inference when certain conditions are met). His general method is simply to discover what rules must be followed when making inferences in science if one wants to satisfy certain basic requirements that seem like they would make sense for any scientific inference.

What you are describing is simply what you have found to be the best method for making scientific inference in the special case you describe (inferring a specific density matrix given a Hilbert space and a set of experimental data).

dextercioby · Nov 3, 2021

PeterDonis said:

Why not?

You need a scalar product space (orthogonality of vectors) to account for the probabilistic interpretation and completion to ensure desirable properties for observables (such as convergence of sequences of experimental values).

A. Neumaier · Nov 3, 2021

PeterDonis said:

When I read your description of what you are doing, it looks the same to me as Jaynes's description of what to do in a similar situation.

PeterDonis said:

I think you are misunderstanding Jaynes's general method. His general method is not specifically Bayesian

Please quote from Jaynes and from my paper, so that we have a common ground for comparison. This is better than making fuzzy statements about equivalence of what you think Jaynes is saying.

PeterDonis said:

If your statement about "deterministic statistical estimation" is really true, it seems useless to me. What good is a model that can only explain a fixed collection of data and can't be updated when new data comes in?

The model can discriminate between data that match the model (in which case you get a sensible estimate with which you can make predictions) and data that don't match it (in which case the model assumption is falsified).

When new data comes in one may pool it with the old data to get a bigger set with which to repeat the analysis. No Bayesian (or Jaynesian) machinery is needed for doing this. However, one can use Bayesian thinking to aggregate the old data into a Bayesian prior and then use the new data to calculate a new estimate from this prior and the new data. In important cases (for conjugate priors) this is mathematically equivalent to what frequentist statisticians do under the label of regularization. See, e.g., my paper

A. Neumaier, Solving ill-conditioned and singular linear systems: A tutorial on regularization, SIAM Review 40 (1998), 636-666.

PeterDonis said:

Then what do scientists mean when they talk about subjectivity of knowledge, and why do you think Jaynes is guilty of it while you are not?

Usually they regard engineering practice (i.e., classical mechanics) and engineering level statistics as objective.

The difference is that between objective (frequentist) and subjective (Bayesian) probability. https://en.wikipedia.org/wiki/Bayesian_probability

A. Neumaier said:

In quantum mechanics, assuming a Hilbert space is a must. Otherwise one cannot even begin making predictions.

PeterDonis said:

Why not?

Well, I know how to do predictions with quantum mechanics in a Hilbert space. If you know how to do it without one, please cite a respectable source from which I can learn it.

PeterDonis · Nov 3, 2021

A. Neumaier said:

The difference is that between objective (frequentist) and subjective (Bayesian) probability.

I think Jaynes would have objected to the description of frequentist probability as "objective" and Bayesian as "subjective", since, as I have noted, he considered the former to be a special case of the latter. But that is probably getting too far off topic for this thread. I agree that the process of estimating density matrix parameters from data that you have described is objective (and I think Jaynes would as well).

A. Neumaier said:

Please quote from Jaynes

What I described as Jaynes's general method in post #44 is taken from his Probability Theory: The Logic Of Science, mainly Chapters 1 (towards the end of which he explains the "desiderata" he thinks any rules of reasoning should satisfy) and 2 (where he gives the quantitative rules that those desiderata imply).

The best brief expression of the generality Jaynes claims for the methods in that book is from the Preface (p. xxii, at the bottom):

Jaynes said:

However, neither the Bayesian nor the frequentist approach is universally applicable, so in the present, more general, work we take a broader view of things. Our theme is simply:probability theory as extended logic. The ‘new’ perception amounts to the recognition that the mathematical rules of probability theory are not merely rules for calculating frequencies of ‘random variables’; they are also the unique consistent rules for conducting inference (i.e. plausible reasoning) of any kind, and we shall apply them in full generality to that end.

PeterDonis · Nov 3, 2021

dextercioby said:

You need a scalar product space (orthogonality of vectors) to account for the probabilistic interpretation and completion to ensure desirable properties for observables (such as convergence of sequences of experimental values).

Yes, this is the sort of argument I was talking about. And in Jaynes's terminology, this means you are using Hilbert space as a prior because you have prior information about the kind of phenomena you are modeling, that tells you that you need to use Hilbert space.

This illustrates, btw, that the term "subjective" can be misleading even when referring to the choice of prior (although that term is often used, and I have used it myself), since the considerations that lead to a particular choice of prior can be perfectly objective.

A. Neumaier · Nov 4, 2021

PeterDonis said:

What I described as Jaynes's general method in post #44 is taken from his Probability Theory: The Logic Of Science, mainly Chapters 1 (towards the end of which he explains the "desiderata" he thinks any rules of reasoning should satisfy) and 2 (where he gives the quantitative rules that those desiderata imply).

OK; this explains our misunderstandings. When I referred to Jaynes I meant his paper

Jaynes, E. T. (1957). Information theory and statistical mechanics. Physical review, 106(4), 620.

where he introduced the notions of knowledge and subjective probability to physics. From his abstract:

Edwin Jaynes said:

Information theory provides a constructive criterion for setting up probability distributions on the basis of partial knowledge. [...] In the resulting "subjective statistical mechanics," the usual rules [...] represent the best estimates that could have been made on the basis of the information available.

Thus the assignment subjective to the Bayesian view is Jaynes', not mine!

PeterDonis said:

And in Jaynes's terminology, this means you are using Hilbert space as a prior

This is not how the word prior was used in Jaynes' paper just mentioned, where the usage ageees with the standard usage today in a probabilistic context. Today's usage is given by https://en.wikipedia.org/wiki/Prior_probability
Thus I don't care about the terminology in Jaynes' book. The point of my paper is not a general philosophy of reasoning as in Jaynes' general considerations.
The point of my paper is a proper conceptual foundation of quantum physics with the same characteristic features as classical physics - except that the density operator takes the place of the phase space coordinates.

In my paper I said said:

When a source is stationary, response rates and probabilities can be measured in principle with arbitrary accuracy, in a reproducible way. Thus they are operationally quantifiable, independent of an observer. This makes them objective properties, in the same sense as in classical mechanics, positions and momenta are objective properties. [...]
Everything can be determined and checked completely independent of any subjective knowledge. Nothing subjective remains: Assuming that a quantum system is in a state different from the true state simply leads to wrong predictions that can be falsified by sufficiently long sequences of measurements. Nothing depends on the knowledge of an observer. The latter can be close to the objective truth or far away – depending on how well informed the observer is.
The assignment of states to stationary sources is as objective as any assignment of properties to macroscopic objects. Thus the knowledge people talk about when referring to the meaning of a quantum state resides in what is encoded in (and hence ”known to”) the model used to describe a quantum system – not to any subjective mind content of a knower!
In particular, as quantum values of members of a quantum measure, all probabilities are objective frequentist probabilities in the sense employed everywhere in experimental physics – classical and quantum. That the probabilities are only approximately given by the relative frequencies simply says that – like all measurements – probability measurements are of limited accuracy only.

PeterDonis said:

This illustrates, btw, that the term "subjective" can be misleading even when referring to the choice of prior (although that term is often used, and I have used it myself), since the considerations that lead to a particular choice of prior can be perfectly objective.

As you used it (and the term 'prior'), it is very misleading!

In mainstream physics, one considers the theoretical framework as given, irrespective of what, in his book, Jaynes calls a prior. This includes the model assumptions - typically the phase space in classical physics, the Hilbert space in quantum physics, the causal rules (Galilean in nonrelativistic physics, Minkowski in special relativity, local Minkowski in general relativity), and the parameterized Hamiltonian in conservative mechanics, the equation of motion in dissipative mechanics.

This plays the same role as axioms in mathematics in theorems - it is just a choice of subject matter. There is nothing subjective about this since all choices are made explicit.

A. Neumaier · Nov 4, 2021

PeterDonis said:

What I described as Jaynes's general method in post #44 is taken from his Probability Theory: The Logic Of Science, mainly Chapters 1 (towards the end of which he explains the "desiderata" he thinks any rules of reasoning should satisfy)

My rules of reasoning are those of classical logic, universally applied in mathematics and physics, including probability theory and quantum physics.

Where does Jaynes define the prior in the general sense you claimed? Please give page numbers. ( I have the 2003 edition.)

How does my assumption that the model is given by a Hilbert space and the parameters by a density matrix (which you call a prior) fit Jaynes' desiderata on p.17?

He assumes degrees of plausibilities, but these do not occur in my model assumptions, unless you take the degree to be 100%.

In the main text, the term 'prior information' appears informally on p.6, and semiformally on p.26, where he discusses change of prior information. But my model assumptions never change, hence these rules do not apply. The formal introduction of priors comes only in Chapter 4 (p.119), and then means prior probability distribution in the subjective Bayesian sense as a state of mind of the robot, not in the objective sense of a property of Nature.

PeterDonis · Nov 4, 2021

A. Neumaier said:

When I referred to Jaynes I meant his paper

Ah, ok. This paper is much earlier than the book I referred to, so it's quite possible that Jaynes's own views changed in between.

In general, as I've said, I agree that the process you're describing is objective, so I don't think there is an issue there for this discussion.

A. Neumaier said:

Where does Jaynes define the prior in the general sense you claimed? Please give page numbers.

From pp. 87-88 in my edition:

Jaynes said:

##X## denotes simply whatever additional information the robot has beyond what we have chosen to call ‘the data’

In other words, Jaynes is using "prior" to denote all relevant information other than the "data", which in your example is the data collected by tomography. So Jaynes would include things like the background physical theory you are using in the prior. It is certainly nothing so limited as just an assumed initial probability distribution over model parameters; it also includes all the reasons why you are using a Hilbert space/density matrix model in the first place. The latter information still plays a role in the calculation since it determines the general formulas that are used.

A. Neumaier said:

How does my assumption that the model is given by a Hilbert space and the parameters by a density matrix (which you call a prior) fit Jaynes' desiderata on p.17?

(IIIb) on p. 19: "The robot always takes into account all of the evidence it has relevant to a question." The fact that the model is given by a Hilbert space and the parameters by a density matrix is a consequence of evidence--all the evidence that establishes that those things are the best way to model quantum systems. So using a Hilbert space model with density matrix parameters is necessary in order to take into account all that evidence.

PeterDonis · Nov 4, 2021

A. Neumaier said:

The formal introduction of priors comes only in Chapter 4 (p.119), and then means prior probability distribution in the subjective Bayesian sense as a state of mind of the robot, not in the objective sense of a property of Nature.

In the example you have been describing, you are the robot. The Hilbert space and density matrix parameters are not "properties of Nature". They are states of your mind, and of the minds of all the other scientists that are using your model. Your estimates of the density matrix parameters are the robot's posteriors. If you are thinking of them as "properties of Nature", Jaynes would say you are committing the mind projection fallacy. Your model is not the same as the thing being modeled.

Fra · Nov 4, 2021

PeterDonis said:

Ah, ok. This paper is much earlier than the book I referred to, so it's quite possible that Jaynes's own views changed in between.

PeterDonis said:

So Jaynes would include things like the background physical theory you are using in the prior. It is certainly nothing so limited as just an assumed initial probability distribution over model parameters; it also includes all the reasons why you are using a Hilbert space/density matrix model in the first place. The latter information still plays a role in the calculation since it determines the general formulas that are used.

Jaynes writes on those same pages (p87) in his book also

"But we caution that the term prior is another of those terms from the distant past that can be inappropriate and misleading today"

If we replace the word robot by agent, Jaynes distinction makes good sense, and I use a similar distinction in thinking about "agents". The distinction is what I think of as the difference betwe the agents microstate, and it's microstructure. The state is defined, RELATIVE to the structure. Ie. state vs statespace. In the big inference picture BOTH the state and the SPACE of stats are bound to be updated, but at different time scales. One can also consider the context of general inference and learning that the STRUCTURE is itself merely a "state" in some bigger space. Except that it does not work to parameterized the infinity of future possibiligies. It leads immediately to fine tuning problems. This argument is made also by Lee Smolin in his talks and writings on evolution of law. IMO, the same argument is of relevant in a general learning. This is what distinguishes "optimal data fitting" via some from more intelligent learning. From the perspective to the agent, the evoltuion of the structure has similarities to various dualities where one can transform the dependent variables and get different dynamics. In such a picture it seems reasonable to exepect the hilbert structure as well to be explained, just like the superficial bayesian update of probability, given a FIXED probability space.

I agree it's clear that Jayne includes these genereal background structure, also in the generalized notion of prior information. One could perhaps discuss here if that is "information" vs knowledge or how one should label it, but in the big learning perspective above, the difference should be clear, no matter how we label it.

/Fredrik

A. Neumaier · Nov 4, 2021

PeterDonis said:

From pp. 87-88 in my edition:

Jaynes said said:

X denotes simply whatever additional information the robot has beyond what we have chosen to call ‘the data’

In other words, Jaynes is using "prior" to denote all relevant information other than the "data", which in your example is the data collected by tomography.

No. You are conflating the notions 'prior information' and 'prior' that Jaynes keeps carefully separate:
On p.88, Jaynes distinguishes several distinct items:

Jaynes said said:

Those who are actively familiar with the use of prior probabilities in current real problems usually abbreviate further, and instead of saying ‘the prior probability’ or ‘the prior probability distribution’, they say simply, ‘the prior’. [...] Let us now use the notation
X = prior information,
H = some hypothesis to be tested,
D = the data,

On p.89, he writes:

Jaynes said said:

we need not only the sampling probability
P(D|H X) but also the prior probabilities for D and H:
$$P(H|DX) = P(H|X)\frac{P(D|H X)}{P(D|X) }. ~~~~~~~~~~~~~(4.3)$$
[...] The left-hand side of (4.3), P(H|DX), is generally called a ‘posterior probability’

On pp.108-109, he discusses the dependence on parameters:

Jaynes said said:

In the problem we are discussing, f is simply an unknown constant parameter. [...] There is a prior pdf [...] Then the posterior pdf for f is given by [...]

Thus:

X, called the prior information, is assumed to be fixed, and contains the model assumptions which specify the model and how the parameters enter the model.
H, called the hypothesis, is a question (Boolean function H(f) of the parameters f) to be answered by the analysis.
D, called the data, is experimental information.
P(H|X), called the prior, is the prior probability of H relative to X. Its dependence on the parameters f (discussed later on p.108) is the prior probability distribution for f.
P(H|DX) is the posterior probability of H relative to X, assuming the data D. Its dependence on the parameters f (discussed later on p.108) is the posterior probability distribution for f.

Thus the model assumptions constitute the prior information, and are quite distinct from both the prior (for a parameter-independent hypothesis) and the prior probability distribution, which encodes a subjective assessment of the likelihood of particular value of the parameters. The prior information never figures in the Bayesian probability calculus since it never changes; it only figures in the notation. Indeed, in practice it is suppressed, simplifying the typography of the formulas. Indeed, the latter is already how Jaynes treated the matter in his famous paper.

PeterDonis said:

(IIIb) on p. 19: "The robot always takes into account all of the evidence it has relevant to a question." The fact that the model is given by a Hilbert space and the parameters by a density matrix is a consequence of evidence--all the evidence that establishes that those things are the best way to model quantum systems. So using a Hilbert space model with density matrix parameters is necessary in order to take into account all that evidence.

The robot takes account of the Hilbert space as part of its unchangeable prior information X, not as part of its subjective prior probabilities. The unchangeable part is objective if specified explicitly, since everyone competent will arrive from such a specified X at the same results (in a deterministic calculation from the data) while the Bayesian probabilistic assessment is subjective and remains subjective during all computations. (Apart from being overkill in most applications.)

PeterDonis · Nov 4, 2021

A. Neumaier said:

You are conflating the notions 'prior information' and 'prior' that Jaynes keeps carefully separate

Whenever I have used the term "prior" in this discussion, I have meant "prior information". I apologize for the imprecise use of terminology.

A. Neumaier said:

the Bayesian probabilistic assessment is subjective

Perhaps we are having trouble because of an ambiguity in the word "subjective". If we are going to describe Bayesian probabilities as "subjective", the term can only mean "dependent on the specific information that the robot has". Different robots with different information can compute different Bayesian probabilities.

However, the word "subjective" in common usage has an additional connotation of arbitrariness which is not at all implied or intended in Jaynes's usage. As Jaynes describes it, the process of computing probabilities from a given set of data is perfectly objective; there is no arbitrariness about it at all. There is only one right way to do it. So there is no subjectivity in the sense of arbitrariness in such computations.

The only difference I can see in your own treatment vs. that of Jaynes is that you have said that computing probability distributions is "overkill" and you only need point estimates. And I have already commented that, in a particular case, Jaynes might well agree with such a judgment, since it is a judgment about the benefits vs. the costs of doing additional computations. Ironically, such judgments are the only things we have discussed in this entire thread that are "subjective" in the sense of common usage--that they are personal choices that have an element of arbitrariness to them.

PeterDonis · Nov 4, 2021

A. Neumaier said:

The unchangeable part is objective if specified explicitly, since everyone competent will arrive from such a specified X at the same results (in a deterministic calculation from the data) while the Bayesian probabilistic assessment is subjective and remains subjective during all computations.

In the particular case you describe, since you have declared by fiat that all "robots" involved (all of the scientists assessing some particular instance of quantum tomography) have all of the same prior information and all of the same data, their Bayesian probabilities will obviously all be the same, since you have removed all possible reasons for them to vary.

Remember that my original post in this subthread, post #33, was to object to a claim (made by @gentzen, not you) that your prescription is "opposite" to what Jaynes would say. My point was simply that, in this particular case, Jaynes would say exactly what you are saying. Even the "subjective" element in probabilities--that different "robots" might have different information--is removed in your example. So what you are describing is in fact perfectly consistent with the general method Jaynes describes. It's just a sort of degenerate case of it, since all of the uncertainty involved has been removed--you know exactly what the right model is and exactly what the data is. So everything relevant is exactly known, and it should be no surprise that everyone agrees on it.

A. Neumaier · Nov 4, 2021

PeterDonis said:

If we are going to describe Bayesian probabilities as "subjective", the term can only mean "dependent on the specific information that the robot has".

No. It means that the robot assesses the same data in a robot-specific way, not deducible from objective rules. Whether this way is due to information or to the prior distribution or to goals or to hopes or fears or to whims is secondary.

PeterDonis said:

As Jaynes describes it, the process of computing probabilities from a given set of data is perfectly objective; there is no arbitrariness about it at all. There is only one right way to do it. So there is no subjectivity in the sense of arbitrariness in such computations.

No. The arbitrariness is in the prior, not in the subsequent computations. Moreover, he assumes an ideal robot that functions on the basis of his rational rules; but a real robot cannot do this since the computations would be far too complex.

PeterDonis said:

In the particular case you describe, since you have declared by fiat that all "robots" involved (all of the scientists assessing some particular instance of quantum tomography) have all of the same prior information and all of the same data, their Bayesian probabilities will obviously all be the same, since you have removed all possible reasons for them to vary.

No. They have the same prior information about the physics, but differ in the prior probability assessment (which is the subjective part) and in the degree to which they are faithful to Jaynes' rational rules for manipulating th prior to get the posterior. Indeed scientists are not robots in Jaynes' sense but have goals and preferences that depend not on the data but affect the way they draw conclusions.

PeterDonis · Nov 4, 2021

A. Neumaier said:

No. It means that the robot assesses the same data in a robot-specific way, not deducible from objective rules

I'm sorry, but I simply don't see Jaynes saying this anywhere. His whole book is about figuring out objective rules for the robot to follow for a given problem. He never talks about different robots using different rules for the same problem; he clearly believes that for any given problem, there is one correct set of rules, and that's the set he's looking for.

A. Neumaier said:

The arbitrariness is in the prior

Jaynes spends considerable time discussing the correct ways to assign priors in various situations, so I'm not sure I agree that it is arbitrary. Of course in many real situations the information is far less amenable to being captured in a precise mathematical formulation than it is in the carefully circumscribed physics problem you describe.

A. Neumaier said:

They have the same prior information about the physics, but differ in the prior probability assessment (which is the subjective part)

I don't see how two scientists that are both using the exact same Hilbert space for a given quantum tomography experiment could differ in their computation of ##P(H|X)## for any ##H##.

A. Neumaier said:

in the degree to which they are faithful to Jaynes' rational rules

Of course no real human agent is ever exactly faithful to any set of rules. But you appear to be ruling that out when you talk about the estimates of density matrix parameters from the data being objective in the sense of all scientists involved agreeing on them. That agreement will only happen if they all follow the same rules in doing their computations.

A. Neumaier said:

Indeed scientists are not robots in Jaynes' sense but have goals and preferences that depend not on the data but affect the way they draw conclusions.

If such goals and preferences really do affect the way conclusions are drawn, Jaynes would say (and I would agree) that they should be captured somewhere in the process of doing the computations. If that cannot be done, I would say that the domain under discussion is not (or not yet) a science, because it is not well understood enough. If a physicist were to tell you he doesn't agree with your density matrix parameter estimates from quantum tomography data, you would expect him to give some cogent physics reason like he thinks you're using the wrong Hilbert space for the system. You wouldn't expect him to say it's because he's of a different political party than you, or some other irrelevant factor. But in many domains, things like political beliefs and ideologies certainly do affect the conclusions people come to from a given set of data. We recognize that by not calling those domains sciences.

gentzen · Nov 4, 2021

PeterDonis said:

In fact, what it is describing is the same kind of thing as what Jaynes describes: the "robot" Jaynes describes builds a model of some system, and uses the model to compute probabilities. Those computations are perfectly objective: they are mathematical operations starting from precisely defined initial propositions, and the same operations applied to the same propositions will give the same answers every time.

The only "subjectivity" involved in Jaynes is that different robots in different states of knowledge--meaning, with different sets of data available to them--will have different models, and will therefore make different computations of probabilities because they are starting from different initial propositions. But that is equally true of experimenters doing quantum tomography: their model is built from the information they have obtained from their experiments, and two experimenters who have run different sets of experiments will have different models, and will therefore compute different probabilities. That is every bit as "subjective" as what Jaynes describes. But of course it's not "subjective" at all in the sense of people just arbitrarily choosing probabilities instead of computing them using specified operations from specified initial propositions--and neither is Jaynes.

Sorry for not answering earlier. Writing about Jaynes is tricky for me, because it triggers so many different thoughts. I remembered that I had an email conversation with Kevin van Horn about him, after I commented on https://bayesium.com/probability-theory-does-not-extend-logic/. Here is an extract of the relevant parts:

Sorry for the extremely long delay before answering. ... Jaynes book definitely had some influence on me, even so I mostly disagreed with what he wrote. I am neither Bayesian nor frequentist, instead of an interpretation, I do believe that game theory and probability theory are closely related (https://blog.computationalcomplexit...showComment=1505472807405#c870512924971687938). ...

You ask why I conclude from the interpretation of classical logic as the logic of subsets of a given set that the restriction to a *single* number is basically a bad idea.

My reasoning is simply that even classical logic is not exclusively concerned with a *single* number from {0,1}, but includes the case where we have multiple such numbers. For example, I sometimes use 4 numbers for a proposition: ("actual fact", "judge/state/government version of fact", "opinion of people around me on fact", "my own opinion on fact"). The number for "actual fact" is not always the most relevant, even so it might be the only one of those 4 numbers some people would consider relevant for a logic of plausible reasoning. If "my own opinion on fact" would be the average of the other three numbers, then it would not obey the product rule of probability theory, even if the other three numbers would individually obey the rules of probability theory. (I might try to fix this by using different weights for different contexts. Those weights would then be the relevance of the different versions of "fact" for my own opinion.)

Your recent paper avoids this issue, because it does not assign probabilities to individual propositions, but focuses on the derivability relation X |= A instead. This is good, because that one is really just satisfied or not satisfied, even in predicate logic and non-classical logic. Some non-classical logic might work with sequents (X, Y, ... |= A, B, ...) instead, but even such a sequent is just satisfied or not satisfied.

... For your theorem, you have to explicitly write down all your background knowledge as a propositional formula, and then get the probability for a given proposition (given your background knowledge) as a result. But for the way Cox's theorem is typically used, you can somehow magically encode your background knowledge into a prior (which is a sort of not necessarily normalisable probability distribution), add some observed facts, and then get the probability for a given proposition (given your prior and your observations) as a result.

Of course, this is a caricature version of the Bayesian interpretation, but people do use it that way. And they use it with the intention to convince other people. So what strikes me as misguided is not when people like Scott Aaronson use Bayesian arguments in addition to more conventional arguments to convince other people, but when they replace perfectly fine arguments by a supposedly superior Bayesian argument and exclaim: "This post supersedes my 2006 post on the same topic, which I hereby retire." For me, this is related to the philosophy of Cox's theorem that a single number is preferable over multiple independent numbers (https://philosophy.stackexchange.co...an-reasoning-related-to-the-scientific-method). On the other hand, when Jaynes explains how to obtain (improper) priors for certain situations (https://bayes.wustl.edu/etj/articles/prior.pdf), I do get deeply impressed and include it in my "day to day" reasoning strategies.

The passage from that paper that most influenced my "day to day" reasoning was:

For example, in a chemical laboratory we find a jar containing an unknown and unlabeled compound. We are at first completely ignorant as to whether a small sample of this compound will dissolve in water or not. But having observed that one small sample does dissolve, we infer immediately that all samples of this compound are water soluble, and although this conclusion does not carry quite the force of deductive proof, we feel strongly that the inference was justified. Yet the Bayes-Laplace rule leads to a negligible small probability of this being true, and yields only a probability of 2/3 that the next sample tested will dissolve.

This theme that there can be situations where a single measurement is already very convincing also reappeared in A. Neumaier's thermal interpretation.

I read Jaynes' book back in 2000, but didn't come very far. I guess I stopped in the 3rd chapter. Somehow I got the impression that I wouldn't get those Bayesian insights from it that I had hoped for. The best place to get those insights in a compressed form I have found so far was: https://windowsontheory.org/2021/04/02/inference-and-statistical-physics/. I did read some of Jaynes' papers, and those were a totally different experience for me: always very succinct and rewarding.

PeterDonis · Nov 4, 2021

gentzen said:

I read Jaynes book back in 2000, but didn't come very far.

An unfortunate thing about the book is that it was not finished when Jaynes died. I suspect that if he had lived long enough to finish it, it would be tighter and more like his papers than it is.

PeterDonis · Nov 4, 2021

gentzen said:

that paper

I note, btw, that the paper you reference (the one titled "Prior Probabilities") has as its explicit purpose to remove "arbitrariness" in assigning prior probabilities.

A. Neumaier · Nov 4, 2021

PeterDonis said:

His whole book is about figuring out objective rules for the robot to follow for a given problem. He never talks about different robots using different rules for the same problem; he clearly believes that for any given problem, there is one correct set of rules, and that's the set he's looking for.

Not 'there is' but 'there should be'! Jaynes argues about rules robots should follow, rather than the rules they actually follow. Jaynes' rules are normative (desiderata), not descriptive (facts). Moreover, even the rules he gives all depend on the prior probability assignment, which is subjective, according to Jaynes' own testimony on p.44:

Jaynes (my italics) said:

In the theory we are developing, any probability assignment is necessarily ‘subjective’ in the sense that it describes only a state of knowledge, and not anything that could be measured in a physical experiment. Inevitably, someone will demand to know: ‘Whose state of knowledge?’ The answer is always: ‘That of the robot – or of anyone else who is given the same information and reasons according to the desiderata used in our derivations in this chapter.’
Anyone who has the same information, but comes to a different conclusion than our robot, is necessarily violating one of those desiderata.

But reality is so complex that many things require qualitative judgment - something that cannot be formalized since (unlike probability, where there is a fair consensus about the basic rules to apply) there is no agreement among humans about how to judge. This is why different scientists confronted with the same data can come to quite different conclusions. Violating these desiderata is a necessity. I want to have a philosophy of probability (and of quantum physics) that reflects actual practice, not a wish list.

PeterDonis said:

I don't see how two scientists that are both using the exact same Hilbert space for a given quantum tomography experiment could differ in their computation of P(H|X) for any H.

They differ in the results whenever they differ in the prior. The prior is by definition a probability distribution hence subjective = robot-specific (in Jaynes' scenarios), a state of the mind of the robot. No two robots will have the same state of the mind unless they are clones of each other, in every detail that might affect the course of their computations.

It is a truism that the states of the mind of two scientists is far from being the same. Scientists are individuals, not clones.

PeterDonis said:

when you talk about the estimates of density matrix parameters from the data being objective in the sense of all scientists involved agreeing on them. That agreement will only happen if they all follow the same rules in doing their computations.

Agreement only means agreement to some statistical accuracy appropriate for the experiments analyzed. I didn't claim perfect agreement.

In my paper I talk about the standard statistical procedures (non-Bayesian, hence violating the desiderata of Jaynes): Using simple relative frequencies to approximate the probabilities that enter the quantum tomography process, and then solving the resulting set of linear equations. In the case of N independent measurements) the tomography results will have (by the law of large numbers) an accuracy of ##O(N^{-1/2})## with a factor in the Landau symbol depending on the details of the statistical estimation procedure and the rounding errors made. A lot of ingenuity goes into making the factor small enough so that reasonably accurate results are possible for more than the tiniest system, which explains why scientists using the same data but different software will get slightly different results. But the details do not matter to conclude that in principle, i.e., allowing for arbitrarily many experimental and exact computation, the true state (density operator) can be found with as close to certainty as one likes. This makes the density operator an objective property of the stationary quantum beam studied, in spite of the different results that one gets in actual computations. The differences are comparable in nature of the differences one gets when different scientists repeat a precisely defined experiment - measurement results are well-known to be not exact, but what is measured is nevertheless thought of (in the model) as something objective.

PeterDonis said:

If a physicist were to tell you he doesn't agree with your density matrix parameter estimates from quantum tomography data, you would expect him to give some cogent physics reason like he thinks you're using the wrong Hilbert space for the system.

Yes, and the cogent reason is that he uses different software and/or weighted the data differently because of this or that judgment, but gets a result consistent with the accuracies to be expected. There are many examples of scientists measuring tabulated physical constants or properties, and the rule is that different studies arrive at different conclusions. Even when analyzing the same data.

No competent physicist would use a wrong Hilbert space, but there are reasons why someone may choose a different Hilbert space than I did in your hypothetical setting: For efficient quantum tomography you need to truncate an infinite-dimensional Hilbert space by a subspace of very low dimensions, and picking this subspace is a matter of judgment and can be done in multiple defensible ways. Results differ. With time, some methods (and details inside the methods) prove to give more accurate or more robust results, and these become standard until superseded by even better methods.

Quantum chemical calculations of ground state energies of molecules are a well-known example where depending on the accuracy wanted you need to choose different schemes, and results are never exactly reproducible unless you use the same software with the same parameters, and in case of quantum Monte Carlo calculations also the same random number generator and the same seed.

PeterDonis said:

explicit purpose to remove "arbitrariness" in assigning prior probabilities.

Jaynes does not succeed in this. There is a notion of noninformative prior for certain classes of estimation problems, but this gives a good prior only if (from the point of view of a frequentist) it resembles the true probability distribution. (Just as in quantum tomography, 'true' makes sense in cases where one can in principle draw arbitrarily large samples of independent realizations.) The reason is that no probability distribution is truly noninformative, so if you don't have past data (or extrapolate from similar experiences) whatever prior you pick is pure prejudice or hope.

PeterDonis · Nov 4, 2021

@A. Neumaier, we clearly have very, very different readings of Jaynes, and this subthread is going well off topic. As far as the specific scenario and methods discussed in your paper, and which you have explained in some detail in your post #68, I don't have anything to add to what I have already said. I certainly am not questioning the overall method of determining quantum density matrix parameters by quantum tomography that you describe.

Sunil · Nov 4, 2021

gentzen said:

Are you aware of the content of the paper "Quantum mechanics via quantum tomography" that this thread is about?

Yes, I'm aware that this subthread about the consistency of the thinking of those who reject realism and causality in Bell discussions but use in in everyday life and in other scientific questions is already off-topic. I have referred to Jaynes because his use of the "robot" solves a similar problem with this inconsistency of human thinking in another domain - plausible reasoning. Everyday plausible reasoning is vague, and also often inconsistent, and nobody cares about that inconsistency because it is anyway vague.

gentzen said:

No robot, no agent, no "subjective mind content of a knower". The meaning of a quantum state resides in what is encoded in (and hence ”known to”) the model! This is almost exactly the opposite of what Jaynes would tell you.

Jaynes is not about "subjective mind of a knower". This sounds like you don't understand the difference between de Finetti's subjective Bayesian interpretation and Jaynes' objective Bayesian interpretation. It is an essential one. Jaynes is about what is rational to conclude given some incomplete information. So, if you have no information about a dice, that means, no information which makes a difference between the numbers, you have to assume the probability ##1/6## for each number. In subjective probability you are free to start with whatever you think is fine. Maybe the probability of 3 is higher because it is a sort of Holy Number? Fine, assign it a higher probability. What is fixed is only the updating. So, computing priors - the probability if there is no information - is meaningless in subjective probability, but is a key in objective probability.

Fra · Nov 4, 2021

You declared already that you do not consider inside agents/robots in this way, the only "agent/robot" you consider is the one defined by the "scientific community". This is fine with me.
But just to reflect a bit more on the connection in the light of the following posts...

A. Neumaier said:

Jaynes argues about rules robots should follow, rather than the rules they actually follow. Jaynes' rules are normative (desiderata), not descriptive (facts).
...
This is why different scientists confronted with the same data can come to quite different conclusions. Violating these desiderata is a necessity. I want to have a philosophy of probability (and of quantum physics) that reflects actual practice, not a wish list.

A. Neumaier said:

Results differ. With time, some methods (and details inside the methods) prove to give more accurate or more robust results, and these become standard until superseded by even better methods.

Tou describe the evolution of the robots(or scientific models if you wish) (ie modifying X)?
This would also suggest a population of robots exhibiting a small variation in X, but those that get in conflict with the "bulk" will likely get destabilised.

In this perspective, in line with your second parahraph above, violating the objective rules is a necessity. And if you think about it, the "objectivity" at the level of your agent (scientific community) is a kind of democracy within the community. Ie. it is not sufficient that a random researcher makes a discover unless it can be reproduced by others etc. This is "agent democracy" condition, it's not a constraint. I don't think anyone would think of consistency among researchers as a constraint, becuase the progression of science requires variation, and thus disagreement.

In return, if we get and asymptotically stable population, one would expect Jayens objectivity to apply to the subset of all indistinguishable robots. (Like we say expect all electrons to behave alike, in similar experiments; we expect all trained physicists to make the same calculations etc). This is when we also see stable rules and laws for each robot as per the classification.

This to me, is the actual practice in science, so I would (for the reasons you also mention) prefer to include the variation of X also the philosophy of inference and physics? This is where where my motivation ends.
I have failed to see a simpler way forward.

/Fredrik

A. Neumaier · Nov 5, 2021

Fra said:

the "objectivity" at the level of your agent (scientific community) is a kind of democracy within the community.

No. Democracy leads to permanent error if the majority is in error.

Rather, the scientific community is a meritocracy of scientists. The best scientists have in the long run the most influence.

Fra said:

one would expect Jaynes objectivity to apply to the subset of all indistinguishable robots.

Yes, but unlike electrons, all robots are distinguishable since they necessarily operate in distinct environments hence gather distinct information. Thus Jaynes' setting is a fiction.

Fra said:

to include the variation of X also the philosophy of inference and physics?

X is called 'the state of the art'. It varies according to a stochastic jump process with jumps whenever a new paper or book is published. Even specifying what X is is a subjective decision, with the result that individual scientists disagree as to what the state of the art actually consists of.

A. Neumaier · Nov 5, 2021

PeterDonis said:

@A. Neumaier, we clearly have very, very different readings of Jaynes

Yes, indeed. But I have given detailed arguments for my reading.

PeterDonis said:

and this subthread is going well off topic.

We could continue the discussion in a subthread if you split the present one. Putting posts 56 and later into the new subthread would be a good way to split.

Fra · Nov 5, 2021

A. Neumaier said:

No. Democracy leads to permanent error if the majority is in error.

Rather, the scientific community is a meritocracy of scientists. The best scientists have in the long run the most influence.

Observer democracy is the metafor I use to label the mechanisms behind emergent objectivity. It does not mean litterality "voting" etc, nor an arithmetic average process. It is more to be understood as survival of the fittest, just they way you mention. The process of the negotiating a common consensus is the democratic process.

But the main point of the "subjective inference" side is one that I think sets us apart:

How does an agent construct the instrinci measure implict in your notion of "error", without relying on the feedback from the environment (ie other agents/scientist)? The core motivatior for the subjective inference is that there exist not absolut truth. THIS is what is "desiderata", an idealisation of the scientific process that IMO does not quite correspond to what actually happens. Presuming the existence of an external measure or judge, IMO, misguides us in trying to solve open problems. (That's not the say the emergent objectivity will not satisfy your wish for FAPP). It's a matter of constructing principle. This is a key between objective and subjective inference view. In this sense, objective approach gives priority to the RESULT as constraining the process, while the subjective approach gives priority to the PROCESS over prejudies over which RESULTS we will find.

I have no illusion that we will change in others minds here, but I think this illustrates the difference in views.

/Fedrik

Sunil · Nov 5, 2021

A. Neumaier said:

But reality is so complex that many things require qualitative judgment - something that cannot be formalized since (unlike probability, where there is a fair consensus about the basic rules to apply) there is no agreement among humans about how to judge.

Complexity is not the point here. The rules of probability theory can be applied also in arbitrary complex situations.

A. Neumaier said:

They differ in the results whenever they differ in the prior. The prior is by definition a probability distribution hence subjective = robot-specific (in Jaynes' scenarios), a state of the mind of the robot. No two robots will have the same state of the mind unless they are clones of each other, in every detail that might affect the course of their computations.

The prior is the state without information. It is the state with maximal entropy. This is unproblematic and well-defined. Entropy is a well-defined function on the space of probability distributions, not? To compute that probability distribution for sufficiently complex configuration spaces may be problematic.

There are some situations where one can reasonably use unbounded priors, which are not probabilities but contain less information than any probability distribution, like ##dx## on the straight line. But in these cases one can also use, instead, probabilities smeared out over extremely large regions, and the size of this region will not make a difference.

A. Neumaier said:

Jaynes does not succeed in this. There is a notion of noninformative prior for certain classes of estimation problems, but this gives a good prior only if (from the point of view of a frequentist) it resembles the true probability distribution. The reason is that no probability distribution is truly noninformative, so if you don't have past data (or extrapolate from similar experiences) whatever prior you pick is pure prejudice or hope.

Why should Bayesians care about what frequentists think? You maximize non-informativeness by maximizing entropy.

gentzen · Nov 5, 2021

PeterDonis said:

Remember that my original post in this subthread, post #33, was to object to a claim (made by @gentzen, not you) that your prescription is "opposite" to what Jaynes would say. My point was simply that, in this particular case, Jaynes would say exactly what you are saying.

I feared that it was me who drove you into that discussion, and that I should better also write something, despite the excellent explanations of A. Neumaier. By "opposite" I meant that Jaynes pushes a Bayesian interpretation of probability, but that A. Neumaier's thermal interpretation contains ideas that I see as a rehabilitation of the frequentist interpretation:

gentzen said:

The work of A. Neumaier contains important ideas and elaborations how to overcome critical issues of the frequentist interpretation. One of those critical issues is that it doesn't apply to single systems, but only to ensembles. And part of the solution is to be realistic about the precision of magnitudes (including probabilities) appropriate for the concrete situation you want to talk about.

gentzen said:

..., and tries to avoid one circularity that could arise in a (virtual) frequentist interpretation of probability (with respect to uncertainties). Let me quote from section "4. A view from the past" from an article about earthquake prediction which I have reread (yesterday and) today:

We will suppose (as we may by lumping several primitive propositions together) that there is just one primitive proposition, the ‘probability axiom,’ and we will call it A for short. ...
Now an A cannot assert a certainty about a particular number n of throws, such as ‘the proportion of 6’s will certainly be within p ± ϵ for large enough n (the largeness depending on ϵ)’. It can only say ‘the proportion will lie between p ± ϵ with at least such and such probability (depending on ϵ and n₀ ) whenever n > n₀ ’. The vicious circle is apparent.

PeterDonis said:

Even the "subjective" element in probabilities--that different "robots" might have different information--is removed in your example. So what you are describing is in fact perfectly consistent with the general method Jaynes describes. It's just a sort of degenerate case of it, ...

A weakness of the Bayesian interpretation is that it has no criteria for the situations where probabilities fail to be "objective". The earthquake prediction example above is such a case. Regarding situations where the frequentist interpretation applies and gives objective meaning to probabilities as a special case of Bayesian reasoning misses the point of the frequentist interpretation.

PeterDonis said:

An unfortunate thing about the book is that it was not finished when Jaynes died. I suspect that if he had lived long enough to finish it, it would be tighter and more like his papers than it is.

PeterDonis said:

I note, btw, that the paper you reference (the one titled "Prior Probabilities") has as its explicit purpose to remove "arbitrariness" in assigning prior probabilities.

Another explanation is that the paper "Prior Probabilities" argues for an objective Bayesian interpretation. The first two chapter of the book on the other hand argue for a Bayesian interpretation without yet distinguishing between subjective and objective Bayesian.

Sunil said:

Jaynes is not about "subjective mind of a knower". This sounds like you don't understand the difference between de Finetti's subjective Bayesian interpretation and Jaynes' objective Bayesian interpretation. It is an essential one. Jaynes is about what is rational to conclude given some incomplete information. ... In subjective probability you are free to start with whatever you think is fine. ... What is fixed is only the updating. So, computing priors - the probability if there is no information - is meaningless in subjective probability, but is a key in objective probability.

This comment made me realize that there might be a deeper reason why I liked Jaynes papers, but had trouble with his book (at least with the first few chapters, where the robot appears), see above.

The trouble with the objective Bayesian interpretation is that it only works in special cases. There exists some knowledge when it works and when it breaks down. I even assumed that A. Neumaier knew more details about this than me, but I am no longer sure after reading his reply: "There is a notion of noninformative prior for certain classes of estimation problems, ..." On the other hand, maybe his reply was basically identical to my "And it is a difficult to understand technical issue." except that he tried to explain it nevertheless:

gentzen said:

In the end, this is the place where I see the non-intuitiveness emerge again in his interpretation. In a certain sense, I believe that he knows this, ... But in a certain sense, it would really be unfair if this would be held against his interpretation, because the issue was there all the time for the Bayesian interpretation, and they coped with it by simply staying silent. And it is a difficult to understand technical issue. To get some feeling for how technical, see for example:
Consistency and strong inconsistency of group-invariant predictive inferences (1999) by Morris L. Eaton and William D. Sudderth
Dutch book against some `objective' priors (2004) by Morris L. Eaton and David A. Freedman

Fra · Nov 5, 2021

gentzen said:

A weakness of the Bayesian interpretation is that it has no criteria for the situations where probabilities fail to be "objective".

Maybe I totally missed your point(?) but how about that objective bayesian view makes sense if and when there exists a non-lossy transformation between the state spaces of different agents, so that one can claim than any agent is equivalent ot another one.

The divider here, is that the objective approch presumes this, and tries to use it as a constraint.

The subjective approach does not presume this, one instead see this transformations which makes symmetries manifest as occasionally emergent. An one can expect this emergence would be spontaneous as agents are allowed to interaction, as a kind steady state of self organistaion.

There is a simple reason I think why subjective views are annoying - they are harder to capture with a fixed math, as the state space is not fixed. If one insistes on this, one is forced to make the state space unreaonsably big from start; which gives a fine tuning problem, where the agent may diverge instead of learn.

This is how I see it.

/Fredrik

PeterDonis · Nov 5, 2021

A. Neumaier said:

We could continue the discussion in a subthread if you split the present one.

I could do that but I'm not sure which forum to put it in. The subthread in question isn't really about quantum interpretations, or indeed about quantum mechanics specifically at all. Perhaps the probability and statistics forum over in the math section would be best.

A. Neumaier · Nov 5, 2021

PeterDonis said:

I could do that but I'm not sure which forum to put it in. The subthread in question isn't really about quantum interpretations, or indeed about quantum mechanics specifically at all. Perhaps the probability and statistics forum over in the math section would be best.

In my opinion no mathematics is involved. It is more about the philosophy of science, which would fit here.

PeterDonis · Nov 5, 2021

A. Neumaier said:

In my opinion no mathematics is involved. It is more about the philosophy of science, which would fit here.

Fair enough. I'll look at splitting it off into a separate thread in this subforum.

PeterDonis · Nov 5, 2021

PeterDonis said:

Fair enough. I'll look at splitting it off into a separate thread in this subforum.

Split has been done, this post is now in the new thread.

PeterDonis · Nov 5, 2021

A. Neumaier said:

I have given detailed arguments for my reading.

You have given your interpretations of what you have quoted from Jaynes. But your interpretation of the things you quote is very different from mine. That is why I said we have very different readings. As one example, I have already explained how we appear to be taking the word "subjective" as Jaynes uses it to mean very different things.

A. Neumaier said:

reality is so complex that many things require qualitative judgment

The particular situation your paper in the original thread describes is not such a situation. No qualitative judgments of the sort that are required in domains that are not hard sciences, are required to estimate density matrix parameters from quantum tomography data. The only judgment that seems to me to be required is about what approximation to use given that exact calculations are computationally intractable. See further comments below.

A. Neumaier said:

the cogent reason is that he uses different software and/or weighted the data differently because of this or that judgment, but gets a result consistent with the accuracies to be expected.

In other words, they are using different approximations to a computationally intractable exact solution.

A. Neumaier said:

there are reasons why someone may choose a different Hilbert space than I did in your hypothetical setting: For efficient quantum tomography you need to truncate an infinite-dimensional Hilbert space by a subspace of very low dimensions, and picking this subspace is a matter of judgment and can be done in multiple defensible ways.

I'm not sure I understand. For any system on which we could do quantum tomography, won't there be one unique finite-dimensional Hilbert space? For example, if I have two qubits, possibly entangled (and I want to use quantum tomography to determine whether they are entangled), isn't the Hilbert space just ##\mathbb{C}^2 \times \mathbb{C}^2##?

gentzen · Nov 8, 2021

Fra said:

Maybe I totally missed your point(?) but how about that objective bayesian view makes sense if and when there exists a non-lossy transformation between the state spaces of different agents, so that one can claim than any agent is equivalent ot another one.

The divider here, is that the objective approch presumes this, and tries to use it as a constraint.

I didn't try to elaborate my point. Maybe a good way to elaborate the point "in my style" would be to point out that the "interpretation of probability in terms of observed frequencies" gets defended despite the "hints" that "mathematically it can only remain an intuitive notion":

Stephen Tashi said:

As to an interpretation of probability in terms of observed frequencies, mathematically it can only remain an intuitive notion. The attempt to use probability to say something definite about an observed frequency is self-contradictory except in the trivial case where you assign a particular frequency a probability of 1, or of zero. For example, it would be satisfying to say "In 100 tosses of a fair coin, at least 3 tosses will be heads". That type of statement is an absolute guaranteed connection between a probabilty and an observed frequency. However, the theorems of probability theory provide no such guaranteed connections. The theorems of probability tell us about the probability of frequencies. The best we can get in absolute guarantees are theorems with conclusions like ##lim_{n\to\infty}Pr(E(n))=1##. Then we must interpret what such a limit means. Poetically, we can say "At infinity the event is guaranteed to happen". But such a verbal interpretation is mathematically imprecise and, in applications, the concept of an event "at infinity" may or may not make sense.

As a question in physics, we can ask whether there exists a property of situations called probability that is independent of different observers - to the extent that if different people perform the same experiment to test a situation, they (probably) will get (approximately) the same estimate for the probability in question if they collect enough data. If we take the view that we live in a universe where scientists have at least average luck, we can replace the qualifying adjective "probably" with "certainly" and if we idealize "enough data" to be"an infinite amount of data", we can change "approximately" to "exactly". Such thinking is permitted in physics. I think the concept is called "physical probability".

This is the background against which I see the rehabilitation of the frequentist interpretation.

So I don't want to dismiss attempts at an objective Bayesian interpretation, but I would say that they don't even have the goal to rehabilitate the frequentist interpretation.

Fra · Nov 8, 2021

gentzen said:

I didn't try to elaborate my point. Maybe a good way to elaborate the point "in my style" would be to point out that the "interpretation of probability in terms of observed frequencies" gets defended despite the "hints" that "mathematically it can only remain an intuitive notion":This is the background against which I see the rehabilitation of the frequentist interpretation.

So I don't want to dismiss attempts at an objective Bayesian interpretation, but I would say that they don't even have the goal to rehabilitate the frequentist interpretation.

I likely misread your point. Still not sure what you mean. The post you refer to also seems to contain the tension between descriptive vs guiding probability, but there may be more issues.

Anyway, for the record, I do not subscribe to the objective bayesian view. If your point is "critique" against the objective bayesians, then I agree.

From my perspective, I think the notion of frequentist interpretation makes sense only in the sense of descriptive probability, it's roughly the same thing. As I see it, from the agent view, there is no conflict here. The descriptive frequentist view, refers to the "input"(evidence), and the guiding bayesian probability refers to a "tendency" or "odds" that determines the agents dice, the results from an inference. The output of that inference obviously do not have a proper frequentist interpretation. But I also do not se the "problem" with that details.

The descriptive probability, is naturally frequentist if you think in terms of counted events. But the guiding probability represents the "dice" that the agent uses for his random walk; it does not related to "integer counts" of anything.

/Fredrik

WernerQH · Nov 8, 2021

I find the discussion whether probabilities are objective or subjective rather pointless. By necessity they are both, because probabilities are the glue that connects our theories to the real world.

Saying that the velocities of the molecules in a gas are subject to a Maxwellian distribution is surely a probabilistic statement. If physics is not to turn into a meaningless game, the temperature of a gas and probabilities must have objective meaning. On the other hand I concur with de Finetti that probabilities cannot be physical. They are part of our description, not of reality itself. And a physical description is always (to varying degrees) tentative (to use a better term than "subjective"). A velocity distribution can turn out to be non-Maxwellian. It is the departures from thermal equilibrium (from the blackbody spectrum in the case of Fraunhofer lines) that permits inferences about the chemical composition of the solar photosphere, for example.

Much like geometry, probability theory is an essential ingredient of many physical theories. It allows non-exhaustive, "sparse" descriptions of reality, and it provides a check if these descriptions are consistent with what we know.

Fra · Nov 9, 2021

WernerQH said:

I concur with de Finetti that probabilities cannot be physical.

WernerQH said:

I find the discussion whether probabilities are objective or subjective rather pointless.

WernerQH said:

They are part of our description, not of reality itself. And a physical description is always (to varying degrees) tentative (to use a better term than "subjective").

I think we are now discussing philosophy of science, and we may have different views on to what extende the foundations of science relate and are relevant to the foundations of physics. I have the opinon that that foundations of physics has a MUCH deeper connection to the foundations of science than say foundations of chemistry.

This is difficult topics to discuss that invariably naturally raises disagreement.

/Fredrik

A. Neumaier · Nov 9, 2021

WernerQH said:

They are part of our description, not of reality itself.

Everything we do in physics is part of our description, not of reality itself. This doesn't make physics subjective. Descriptive probability is objective whenever the way how one arrives at the probabilities given the data are made explicit.

WernerQH said:

A velocity distribution can turn out to be non-Maxwellian.

But not in the many cases where the data are compatible with a Maxwell distribution.

WernerQH said:

It is the departures from thermal equilibrium (from the blackbody spectrum in the case of Fraunhofer lines) that permits inferences about the chemical composition

The amount of departure that permits the inference is equally objective, since it does not depend on who observes it or makes the analysis.

I Bayesian statistics in science

Similar threads

Hot Threads

A Understanding Barandes' microscopic theory of causality

I Entanglement: is there 'action at a distance' due to measurement?

A Barandes's Unistochastic Refomulation Applied to Entanglement Swapping

I Does Time-Symmetry Imply Retrocausality? How does the Quantum World Say “Maybe”?

I Physicists disagree wildly on what quantum mechanics says about real…

Recent Insights

Insights Thinking Outside The Box Versus Knowing What’s In The Box

Insights Why Entangled Photon-Polarization Qubits Violate Bell’s Inequality

Insights Quantum Entanglement is a Kinematic Fact, not a Dynamical Effect

Insights What Exactly is Dirac’s Delta Function? - Insight

Insights Relativator (Circular Slide-Rule): Simulated with Desmos - Insight

Insights Fixing Things Which Can Go Wrong With Complex Numbers