Understanding probability, is probability defined?

In summary, the concept of probability is often described in terms of events, outcomes, and relative frequency, but it is not formally defined in this way. Probability theory is built upon abstract definitions and mathematical principles, rather than concrete observations of relative frequency. While the frequentist interpretation of probability may align with the formalism in some cases, it is not the only way to understand probability. Ultimately, probability theory deals with the likelihood of events occurring, but it does not provide a definitive answer or guarantee of actual outcomes.
  • #36
Stephen Tashi said:
The limit in the theorem is a special kind of limit. That's why there is an "a.s" in the notation of the limit. This is not the kind of limit used in ordinary calculus. So you can't say an "a.s" limit implies the existence of the kind of limit used in ordinary calculus.

Ok, I see, good point. I guess what you are saying is very logical, because if the existence of the limit were implied, it would be a proof that relative frequencies are probabilities. Thanks for disproving the set inclusion question. I wanted a way to visualize what it means for something to have probability 1. In the text they say "we are essentially certain that the relative frequency will converge to the probability". But how can they say this? Even though we have said that an event has probability 1, we haven't defined probability 1 to mean that it must happen, have we?

I mean, how are we supposed to interpret the strong law of large numbers/the absolute certain convergence of relative frequency? My attempt earlier to show set inclusion failed, as you showed. Is there something else "concrete" that can be done to back up the statement "essentially certain"?
Is it maybe possible in this case that P(K) = 1 implies K = S? Because in your case there was an uncountable number of possibilities, but in this case we can enumerate each case ((n=1,nA=1), (n=1,nA=0), (n=2,nA=0), etc.), so we get a smaller cardinality than in your example. (I am on very thin ice here, but I am just trying to find a way to interpret the result.)
 
  • #37
bobby2k said:
In the text they say "we are essentially certain that the relative frequency will converge to the probability". But how can they say this? Even though we have said that an event has probability 1, we haven't defined probability 1 to mean that it must happen, have we?

The meaning of "essentially certain" hasn't been defined. (For example, what is the difference between "certain" and "essentially certain"?) Of course, if you define it to mean the same thing as "the frequency converges almost surely" in the technical sense of that mathematical definition then you could say we are "essentially certain".

I mean, how are we supposed to interpret the strong law of large numbers/the absolute certain convergence of relative frequency? My attempt earlier to show set inclusion failed, as you showed. Is there something else "concrete" that can be done to back up the statement "essentially certain"?

You have to examine the technical definition of "almost sure" convergence to understand what the law of large numbers says about relative frequency. You can't simplify what it says to a statement that circumvents that definition.
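
For reference, the definition being pointed to can be written out roughly as follows. This is a sketch in standard notation rather than something taken from the thread: ω denotes one outcome in the sample space S (here, one infinite sequence of trials), and n_A(ω, n) is an assumed name for the number of times A occurs in the first n trials of ω.

```latex
% Almost sure convergence (what the strong law asserts):
P\left(\left\{\omega \in S : \lim_{n\to\infty} \frac{n_A(\omega, n)}{n} = P(A)\right\}\right) = 1

% Sure (pointwise) convergence, which the strong law does NOT assert:
\lim_{n\to\infty} \frac{n_A(\omega, n)}{n} = P(A) \quad \text{for every } \omega \in S
```

The ordinary calculus limit sits inside the braces, taken one outcome at a time. The "a.s." statement only says that the set of outcomes for which that limit holds has probability 1.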

Is it maybe possible in this case that P(K) = 1 implies K = S? Because in your case there was an uncountable number of possibilities, but in this case we can enumerate each case ((n=1,nA=1), (n=1,nA=0), (n=2,nA=0), etc.), so we get a smaller cardinality than in your example. (I am on very thin ice here, but I am just trying to find a way to interpret the result.)

Yes, spaces of finite or countably infinite outcomes are different than spaces with uncountably many outcomes. However, if you are considering the probability of something like infinite sequences of coin tosses, each infinite sequence is an outcome and there are an uncountable number of possible sequences.

If you are thinking about how probability relates to the mathematical version of reality, you should think about how probability and logic interact. There are fundamental laws of logic that allow certain deductions. For example, "modus ponens":

Given:
If statement A is true then statement B is true
Statement A is true

Conclude: Statement B is true

You can consider whether the probability of truth can be directly incorporated into logic. For example, is it valid to use reasoning like:

Given:
If statement A is true then statement B is true
Statement A is true with probability 1

Conclude:
Statement B is true

I don't know any mainstream versions of logic that allow such deductions.
 
  • #38
Hello again Stephen Tashi, I haven't had much time to finish this during the semester, but I have some statements I hope you can look at, to see whether I have finally got it. I've been reading about the SLLN, and I think I get it now, although I don't yet have the skills to go deep into measure theory.

I get now that if an event has probability P(A), it is wrong to say that n(A)/n will converge to P(A), because the sample space, which contains all the infinite sequences, must also contain sequences that do not converge, so we have no guarantee of convergence.

However, is this a good justification for the relative frequency interpretation?

Assigning probability according to relative frequency.

Assume that an event A has probability P(A) and that we are able to watch a "practical" long sequence of independent experiments, where n(A) is the number of outcomes that are contained in A, and n is the total number of experiments. We assume that this experiment is part of an infinite sequence which we chose to truncate. This sequence is contained in the sample space, which has all infinite sequences, so we do not know whether it is a sequence whose relative frequency converges to P(A) or not. However, Borel's strong law of large numbers says that with probability 1, we will have a sequence that converges to P(A).
Even though probability 1 does not imply that we will get convergence, we choose to assume that our sequence is convergent, because we have probability 1 from Borel's law. The basis for this assumption is basically that since we have defined the probability axioms the way we have, and associate high probability with events we feel are more likely to happen, it is not a bad idea to assume that an event with probability 1 occurs.
Then, since we assume we have a convergent sequence, we say that P(A)≈n(A)/n, even though we have no way of knowing how far we have come since we truncated.
Do you agree with this? That the only reason we assume that a high-probability event occurs is the fact that it seems logical according to how we defined probability (non-negative, P(S)=1, and the probabilities of mutually disjoint events can be added). So if an event has high probability, it must in some sense contain outcomes that together are regarded as likely to occur. (But of course the outcomes may have zero probability if we regard them alone.)
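
To make the truncated-sequence picture concrete, here is a minimal simulation sketch. It assumes the event A can be modelled by a pseudo-random draw with P(A) = p; the function name `relative_frequencies` and the value p = 0.3 are illustrative choices, not something from the thread. A finite run like this illustrates the behaviour of n(A)/n, but it cannot establish convergence of the infinite sequence.

```python
import random

def relative_frequencies(p, checkpoints, seed=0):
    """Simulate independent trials of an event A with P(A) = p and
    return the relative frequency n(A)/n at each checkpoint n."""
    rng = random.Random(seed)      # fixed seed so the run is reproducible
    wanted = set(checkpoints)
    n_a = 0                        # number of trials so far in which A occurred
    results = {}
    for n in range(1, max(checkpoints) + 1):
        if rng.random() < p:       # event A occurred on trial n
            n_a += 1
        if n in wanted:
            results[n] = n_a / n
    return results

if __name__ == "__main__":
    freqs = relative_frequencies(0.3, [10, 100, 1_000, 10_000, 100_000])
    for n, freq in freqs.items():
        print(f"n = {n:>7}: n(A)/n = {freq:.4f}")
```

Changing the seed gives a different truncated sequence. Nothing in the run tells us how close to P(A) a given truncation is guaranteed to be, which is the point made above.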
 
  • #39
bobby2k said:
Even though probability 1 does not imply that we will get convergence, we choose to assume that our sequence is convergent, because we have probability 1 from Borel's law. The basis for this assumption is basically that since we have defined the probability axioms the way we have, and associate high probability with events we feel are more likely to happen, it is not a bad idea to assume that an event with probability 1 occurs.
Then, since we assume we have a convergent sequence, we say that P(A)≈n(A)/n, even though we have no way of knowing how far we have come since we truncated.

Do you agree with this?

How do you want your idea to be criticized? - as a personal way of thinking about the law of large numbers? - or as actual mathematics? The standards for personal ways of looking at things are rather lax. I suppose many people have the personal outlook that probability "is" long term frequency. Your idea is more sophisticated, but it is not precise enough to pass muster as mathematics.

You essentially say that we will assume that an event with probability 1 will actually happen. As I keep saying, the axioms of probability don't deal with the topic of whether things actually happen or not. It's the "word problems" in probability books that tell us stories about things actually happening and samples being taken. This is analogous to the situation in elementary algebra. The laws of algebra don't contain assumptions about people's ages or rates of work. But the word problems have stories like "Tom is twice as old as Sue" or "Bill can build a dog house in 5 hours".
 
  • #40
Stephen Tashi said:
How do you want your idea to be criticized? - as a personal way of thinking about the law of large numbers? - or as actual mathematics? The standards for personal ways of looking at things are rather lax. I suppose many people have the personal outlook that probability "is" long term frequency. Your idea is more sophisticated, but it is not precise enough to pass muster as mathematics.

You essentially say that we will assume that an event with probability 1 will actually happen. As I keep saying, the axioms of probability don't deal with the topic of whether things actually happen or not. It's the "word problems" in probability books that tell us stories about things actually happening and samples being taken. This is analogous to the situation in elementary algebra. The laws of algebra don't contain assumptions about people's ages or rates of work. But the word problems have stories like "Tom is twice as old as Sue" or "Bill can build a dog house in 5 hours".

I am not trying to give a very rigorous proof, but I want to understand why it is acceptable to use probability in practice, so it may be my personal way of thinking. Maybe the answer is not in the axioms, but there must be some "semi-rigorous" mathematical explanation of why it works. And then these word problems you mention are very relevant. I do get that probability theory is very well defined in the world of real analysis and measure theory, but the link that connects this world to its usage in statistics is hard to see.

When I used the wording "choose to assume" I meant it in kind of a practical way, maybe I should have used a wording like highly confident instead.

Then at least I think the problem is not about relative frequencies, but about why we expect something with high probability to occur (because then the relative frequency follows automatically). (Some months ago I asked you about this in a wrong way; I wanted to prove that if P(B)=1, then "sample space"[itex]\subset[/itex]B, which was wrong.) I get that "expected occurrence" is not implied by the axioms and a high probability alone, so maybe there is something in the modelling world that we assume?
Do you accept the argument below? (I only work with countable outcomes for simplicity.) It is supposed to be an argument for why we expect something with high probability to occur.

mathematical reasoning, here I try to be rigorous:
- Assume that P(A) = 1 or very close to 1. (My point is that it is supposed to be for high probabilities, not necessarily 1.)
- Since the sample space is countable, the event A must consist of countably many outcomes.
- Since A = {outcomeA 1} [itex]\cup[/itex] {outcomeA 2} [itex]\cup[/itex] ..., and {outcomeA i} [itex]\cap[/itex] {outcomeA j} = [itex]\emptyset[/itex] for i ≠ j, Axiom 3 gives that [itex]\sum_i P[/itex]({outcomeA i}) = "something very close to 1"

Now this is all we know from the axioms, that the sum of the probabilities of the outcomes is 1 or close to 1, but it says nothing about a tendency for something to "happen".

modelling world, continued:
- We assign probabilities to the simple outcomes in the sample space not only so that they fit the axioms, but also on the assumption that the probability function gives high values to outcomes that are very likely to occur, and lower values to outcomes that are less likely to occur.
- Now, if we end up with an event B which has very high probability, then the mathematical reasoning says that the sum of the probabilities of the outcomes in the event must be very high. Since we also assumed that outcomes are given a probability function value after how likely they are to happen, then it is intuitive that event B must happen instead of [itex]\bar{B}[/itex], and we are confident, or expect, that event B will happen.

Is this the correct way of looking at it, in the sense that we in the application/modelling world give probabilities a meaning, and if the probability of a calculated event is high, then the above argument gives a good reason for why we may believe that it is supposed to happen?

PS: I really value your help!
 
  • #41
bobby2k said:
Since we also assumed that outcomes are given a probability function value after how likely they are to happen

What is accomplished by using the phrase "likely to happen" instead of the word "probability"? Are we assuming that a person doing practical applications has a notion of "likely to happen" that is different than the notion of "probability" as defined in mathematics? I agree that "likely to happen" suggests it has some connection to events actually happening or not, but the word "likely" doesn't clarify what that connection is.

then it is intuitive that event B must happen instead of [itex]\bar{B}[/itex], and we are confident, or expect, that event B will happen.

This is the way most people think (in the way that "confident" is defined in ordinary speech, not in the way that "confidence" is defined in statistics). However, I don't see that you have made any sort of argument for thinking this way.

Is this the correct way of looking at it, in the sense that we in the application/modelling world give probabilities a meaning, and if the probability of a calculated event is high, then the above argument gives a good reason for why we may believe that it is supposed to happen?

You haven't given an argument. You've merely expressed a belief. I'd express the belief this way: "I believe that I live in a world where the observed frequency of high probability events is high." I don't think that's a controversial belief. Note that it expresses the idea that there is some physical property of events in the world that is "probability" and that there is some different property that is "actuality".
 
  • #42
I used the phrase "likely to happen" because it seems that probability is just defined as a measure, or function, but in order to use probability we must give the function a meaning of some sort, and that meaning I assume to be "tendency to happen" or something similar. If I understood you correctly, this meaning is not implied by any of the axioms, but if we do not give probability this meaning, then we can't use the theory for prediction or anything useful?

Stephen Tashi said:
You haven't given an argument. You've merely expressed a belief. I'd express the belief this way: "I believe that I live in a world where the observed frequency of high probability events is high." I don't think that's a controversial belief. Note that it expresses the idea that there is some physical property of events in the world that is "probability" and that there is some different property that is "actuality".

It may be a belief, but virtually everyone believes that if we want an event to happen, then a high probability is favoured over a low probability. This belief must be because we believe that probability theory helps us, and that it is able to predict something?


It doesn't have to be the 1 that comes from the SLLN, but take another situation where we have probability very close to 1.
Let's say you have to take a bet with me, where I draw 1 000 000 independent standard normal random variables (assuming we can do this). I count all the instances where a variable is in the interval [-1,1], and call this number x.
And I tell you that you can bet that x is either in A = {670 000, 670 001, ..., 690 000}
or in B = {0, 1, ..., 1 000 000}\A.
Probability theory gives that the event of x being in A has probability very close to 1. But why is it that we trust this number so much that we are willing to bet on A? (Assuming you have to make the bet.)

I mean, in order to trust probability theory and choose A, must we not assume that the standard normal distribution in a sense gives a measure of "how likely" it is that the variable is between -1 and 1? And since the axioms say that the maximum probability is 1, that every probability is non-negative, and that the probabilities of disjoint events can be added, then when we get a final probability which is close to 1, it is also logical to say that this event is very likely to happen, because it must contain simple outcomes that together we think are very likely to happen.

I mean, is this not the reason why we "trust" the number 1, or a number close to 1, when it occurs in probability theory? If not, could you please explain why you would bet on A in the case above? (Assuming you had to make the bet.)
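
As a rough check of the claim that x lands in A with probability very close to 1, the sketch below computes two numbers for the bet. It assumes the model in the post: x is binomial with N = 1 000 000 trials and success probability p = P(-1 ≤ Z ≤ 1). The variable names are illustrative. The normal approximation is only a rough guide this far out in the tails, while the Chebyshev bound is crude but holds exactly under the model.

```python
import math

N = 1_000_000                   # number of independent standard normal draws
LOW, HIGH = 670_000, 690_000    # the set A = {670000, ..., 690000}

# P(-1 <= Z <= 1) for a standard normal Z, via the error function.
p = math.erf(1 / math.sqrt(2))

# x counts the draws landing in [-1, 1], so x ~ Binomial(N, p).
mean = N * p
var = N * p * (1 - p)
sd = math.sqrt(var)

def upper_tail(z):
    """P(Z >= z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Normal approximation (with continuity correction) to P(x not in A).
p_miss_approx = (upper_tail((HIGH + 0.5 - mean) / sd)
                 + upper_tail((mean - (LOW - 0.5)) / sd))

# Chebyshev's inequality: P(|x - mean| >= k) <= var / k**2, where k is the
# smallest deviation from the mean that takes x outside A.
k = min(mean - (LOW - 1), (HIGH + 1) - mean)
p_miss_chebyshev = var / k**2

if __name__ == "__main__":
    print(f"p = {p:.6f}, mean = {mean:.1f}, sd = {sd:.1f}")
    print(f"normal approximation to P(x not in A): {p_miss_approx:.3e}")
    print(f"Chebyshev bound on P(x not in A):      {p_miss_chebyshev:.3e}")
```

Both numbers are statements inside the model. They say the probability of missing A is small (the Chebyshev bound alone puts it below one percent), which still leaves open the question raised in the post of why a small probability should guide the bet.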
 
  • #43
Stephen Tashi said:
Consider how the probability of an event changes. If a prize is placed "at random" behind one of 3 doors, the probability of it being behind the second door is 1/3. If we open the first door and the prize is not there then the probability of it being behind the second door changes to 1/2.

A bit off the topic, but haven't you described the Monty Hall problem? In which case, the probability of finding the prize behind a door is not the same between the two doors.
 
  • #44
bobby2k said:
It may be a belief, but virtually everyone believes that if we want an event to happen, then a high probability is favoured over a low probability. This belief must be because we believe that probability theory helps us, and that it is able to predict something?

I don't know what you are trying to accomplish. Understanding something intuitively is a personal matter. You want to find some words that give you the feeling that you understand the relation between probability and observed frequency. You are comfortable using words like "favored", "likely", "predict". All I can say is "suit yourself". Things must be precisely defined in order to do mathematics. How people bet is a matter of human behavior, not a matter of mathematics.
 
  • #45
austinuni said:
A bit off the topic, but haven't you described the Monty Hall problem? In which case, the probability of finding the prize behind a door is not the same between the two doors.

He did not. Because "we" opened the door. Monty Hall relies on Monty knowing where the prize is, and Monty using that knowledge when he opens the door.
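
The difference between the two situations can be checked with a small simulation. This is a sketch under stated assumptions: doors are numbered 0, 1, 2, the "second door" is index 1, and the function name `second_door_rate` is made up for illustration. In the first scenario "we" open door 0 without knowing anything and keep only the rounds where it turns out empty. In the second, the contestant holds door 1 and a host who knows the prize location opens an empty door among the other two.

```python
import random

def second_door_rate(host_knows, trials=100_000, seed=0):
    """Estimate P(prize behind door 1) after another door is opened and shown empty."""
    rng = random.Random(seed)
    kept = 0    # rounds in which the opened door was empty
    hits = 0    # kept rounds in which the prize was behind door 1
    for _ in range(trials):
        prize = rng.randrange(3)
        if host_knows:
            # Monty Hall: the host deliberately opens an empty door among 0 and 2,
            # so every round is kept.
            opened = rng.choice([d for d in (0, 2) if d != prize])
        else:
            # "We" open door 0 blindly; discard the round if the prize is there.
            opened = 0
            if opened == prize:
                continue
        kept += 1
        hits += (prize == 1)
    return hits / kept

if __name__ == "__main__":
    print("blind opening, door 0 empty:", second_door_rate(host_knows=False))
    print("host knows (Monty Hall):    ", second_door_rate(host_knows=True))
```

Under these assumptions the blind-opening estimate should sit near 1/2 and the Monty Hall estimate near 1/3 (so switching wins about 2/3 of the time), which matches the distinction drawn in the reply above.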
 
  • #46
Stephen Tashi said:
I don't know what you are trying to accomplish. Understanding something intuitively is a personal matter. You want to find some words that give you the feeling that you understand the relation between probability and observed frequency. You are comfortable using words like "favored", "likely", "predict". All I can say is "suit yourself". Things must be precisely defined in order to do mathematics. How people bet is a matter of human behavior, not a matter of mathematics.

Yeah, in a way that is what I want, but I don't think it is only personal. There is a huge market in insurance where they use probability theory for "betting". These actuaries use probability theory, but it seems like everyone is assuming something about the real world that justifies them in using it. (I am not saying that they are assuming something about probability, but maybe something about the real world.)

I appreciate your patience. I've been thinking very much about this the last couple of days, and I think I have an explanation that satisfies me (this may be personal, as you say), but can you please see if it is an acceptable explanation in order to apply probability theory? I will not make any definitions or assumptions about probability theory, only about the world in which we use probability theory.

Compare, for instance, Hooke's law in mechanics. Hooke's law is defined mathematically, and there is nothing that guarantees that the mathematics of it applies to the real world. But engineers use the law, because many stress-strain experiments seem to be in accordance with Hooke's law. So they assume that the model works. This is the kind of understanding I want with probability theory: what is defined in mathematics, and what has to be assumed about the real world in order to use the theory.

So I get now that it is very wrong to say that probability is relative frequency, and it is also wrong to say that we can model probability as relative frequency. But maybe what we do when we apply probability theory is the opposite? That is, we model the physical relative frequency as probability. Then we are not making any assumptions about the mathematical world, but only about the real world: we assume that some relative frequencies are stable, and we model them as probability?
Earlier you mentioned Von Mises collectives, but this is not what I mean when I say we model relative frequency as probability, because as I understood from what I read about collectives, they have to converge. But in the real world relative frequencies may converge almost all of the time, while theoretically they can diverge (like the coin tossing, where as you said there exist infinite sequences that diverge).

So basically probability lives only in the mathematical world, like Hooke's law does in mechanics. But if we do experiments in our real world, we see that relative frequencies seem to converge almost all of the time. Take the coin-tossing experiment: when we do this experiment in the real world it converges, although we know theoretically that there exist infinite sequences that physically could occur and diverge.
However, since it is plausible that relative frequencies in the real world behave like probability, we model them as probability. As with Hooke's law in mechanics, we have no way of showing that this is correct, but experiments seem to suggest it is. Also, since the SLLN says that a relative frequency will converge "with probability 1", the theory takes into account that we may have divergence.

In conclusion, the point was that we do not say that probability is anything other than what it is defined to be, but we model the real world relative frequencies as probability. It seems like a good idea because some experiments in the real world seem to have convergence, but theoretically we know they can diverge, and probability theory says that with probability 1 we will have convergence, but it also takes into account that we may have divergence on a set of measure 0. Empirical evidence shows that it is plausible that relative frequency can be modeled as probability?

Is this an acceptable personal "mental bridge" from the mathematical theory of probability to the applications in the real world regarding relative frequencies? Without the empirical evidence of stable relative frequencies, a lot (probably not all) of the usage of probability theory would not be possible?

I got the book written by the man who invented the axioms. He says a little about this in a chapter on how to apply probability. But he doesn't say clearly that he models relative frequency as probability; maybe that is what he means?
http://postimg.org/image/ge5tlgazf/

I wish you a merry christmas!
 
  • #47
bobby2k said:
experiments in the real world seem to have convergence, but theoretically we know they can diverge, and probability theory says that with probability 1 we will have convergence, but it also takes into account that we may have divergence on a set of measure 0. Empirical evidence shows that it is plausible that relative frequency can be modeled as probability?

You are describing one particular type of experiment, something like repeated tosses of a coin. Do you think such experiments are commonly done in the real world? (For example, pricing life insurance based on probability theory is a more complicated scenario than coin tossing.) I think you are dealing with "thought experiments". I agree that in such a thought experiment most people who apply probability theory think that they live in a world where the observed frequency will (definitely) converge to the underlying probability of the outcome as the number of trials becomes large. However, I think the empirical experience that scientists have in using probability theory is from more complicated experiments.
 
  • #48
Stephen Tashi said:
You are describing one particular type of experiment, something like repeated tosses of a coin. Do you think such experiments are commonly done in the real world? (For example, pricing life insurance based on probability theory is a more complicated scenario than coin tossing.) I think you are dealing with "thought experiments". I agree that in such a thought experiment most people who apply probability theory think that they live in a world where the observed frequency will (definitely) converge to the underlying probability of the outcome as the number of trials becomes large. However, I think the empirical experience that scientists have in using probability theory is from more complicated experiments.

Yeah, I see your point. When the experiments aren't repeatable I see that the situation is more complicated. I won't bother too much with that for now, but I'll be sure to try to understand situations and experiments like this better in the future.

But what I was most curious about knowing was what you mentioned:
I agree that in such a thought experiment most people who apply probability theory think that they live in a world where the observed frequency will (definitely) converge to the underlying probability of the outcome as the number of trials becomes large.
Now what I've been trying to find out all along is why "most people" are justified in doing this. I see two ways of justifying it. The first one, which we discussed and which is wrong, is saying that "if we assume probability is relative frequency, then the theory works". This was wrong because probability is defined precisely as a measure, and if we give the measure another property like relative frequency, that creates mathematical difficulties.
The other way of justifying what you mentioned is that if we have a repeatable experiment, then we model the relative frequencies in the "real world" as the abstract probabilities in the mathematical world. Do you agree that this is the assumption about the real world that allows people to say what you said (without the word "definitely", but instead "almost definitely")? This model then also takes into account that we do not definitely have to have convergence: there is a possibility in the real world that we do not have convergence, as well as in the mathematical world.
 
  • #49
bobby2k said:
I have taken a course in probability and statistics, and did well, but still I feel that I do not grasp the core of what holds the theory together. It is a little weird that I should use a lot of theory when I do not get the simple building block of the theory.

I am basically wondering if probability is defined in some way?

In the statistics books I have looked in, probability is not defined, but at the beginning of the book they give a description of how we can look at probability, and this is usually the relative frequency model, but they never define it to be this?

These steps are what I seem to see in statistics books. Do they seem fair?

1. Probability is described in terms of events, outcomes and relative frequency, but never defined.
2. A lot of theory is then built regarding probability.
3. Then with the help of Chebyshev's inequality, we are able to show that the relative frequency model is correct. That is, if the probability for an event is p, and X is a Bernoulli random variable, then mean(X) will converge to p.

Do you see my problem? If we say that the probability for an event is p, then we can show that the relative frequency of the event in the long run is p. In order to show this, we used all the theory of linear combinations, variance etc. But this means that the relative frequency model is a consequence of our theory, correct?

I mean, we cannot say that the probability is the relative frequency, then develop a lot of theory, and then prove that p equals the relative frequency, because then we are going in a circle?

You are right. The relative frequency thing doesn't work all the time, particularly when working with infinite sets.

Real probability is based on measure theory and Kolmogorov's axioms. It isn't that complicated, but it is too hard to explain in an Internet post today.
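
For readers following step 3 of the quoted post, the Chebyshev argument can be sketched as below. The notation is assumed rather than taken from the thread: X_1, ..., X_n are independent Bernoulli(p) variables and \bar{X}_n is their sample mean, i.e. the relative frequency after n trials.

```latex
% X_1, ..., X_n independent Bernoulli(p); \bar{X}_n = (X_1 + \dots + X_n)/n
E[\bar{X}_n] = p, \qquad \operatorname{Var}(\bar{X}_n) = \frac{p(1-p)}{n}

% Chebyshev's inequality, for any \varepsilon > 0:
P\left(\left|\bar{X}_n - p\right| \ge \varepsilon\right)
  \le \frac{p(1-p)}{n\varepsilon^2}
  \le \frac{1}{4n\varepsilon^2}
  \xrightarrow[n\to\infty]{} 0
```

The conclusion is itself a statement about a probability (convergence in probability, the weak law of large numbers). It is derived from the axioms and does not define probability as relative frequency, so the derivation is not circular as long as probability was introduced axiomatically in the first place.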
 
  • #50
bobby2k said:
Now what I've been trying to find out all along is why "most people" are justified in doing this.

They aren't justified in any axiomatic sense.
 
  • #51
Stephen Tashi said:
They aren't justified in any axiomatic sense.

Even if you replace the word "definitely" with "almost surely" or "almost definitely", and say that we model the relative frequencies as probabilities?
What then is your comment on what Mr. Kolmogorov wrote?

http://postimg.org/image/ge5tlgazf/
 
  • #52
bobby2k said:
What then is your comment on what Mr. Kolmogorov wrote?

http://postimg.org/image/ge5tlgazf/

Being "practically certain" expresses a belief.
 
  • #53
bobby2k said:
I have taken a course in probability and statistics, and did well, but still I feel that I do not grasp the core of what holds the theory together. It is a little weird that I should use a lot of theory when I do not get the simple building block of the theory.

I am basically wondering if probability is defined in some way?

In the statistics books I have looked in, probability is not defined, but at the beginning of the book they give a description of how we can look at probability, and this is usually the relative frequency model, but they never define it to be this?

These steps are what I seem to see in statistics books. Do they seem fair?

1. Probability is described in terms of events, outcomes and relative frequency, but never defined.

It is sad that the book is about something never defined.

I found this on Kolmogorov's probability theory:

http://en.wikipedia.org/wiki/Probability_axioms
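
For convenience, the axioms behind that link can be summarized roughly as follows. The notation is an assumption for this sketch: S is the sample space (as elsewhere in the thread), \mathcal{F} is the collection of events, and P is the probability measure.

```latex
% Axiom 1 (non-negativity):
P(E) \ge 0 \quad \text{for every event } E \in \mathcal{F}

% Axiom 2 (unit measure):
P(S) = 1

% Axiom 3 (countable additivity), for pairwise disjoint events E_1, E_2, \dots:
P\left(\bigcup_{i=1}^{\infty} E_i\right) = \sum_{i=1}^{\infty} P(E_i)
```

Nothing in these three statements mentions relative frequency or whether an event actually happens; they only constrain how the numbers assigned to events must fit together.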
 
  • #54
Stephen Tashi said:
Being "practically certain" expresses a belief.

I think I am starting to understand what you mean.

I also see now that the statement "we model real world relative frequencies as probabilities" likely is wrong.
When I first justified this in my head, I thought that since almost sure convergence says that we have convergence most of the time but leaves room for divergence, it seemed ok to use the relative frequency interpretation on this. However, it seems very sketchy to use the rel. freq. int. on the probability from the SLLN but not on the original probability, and of course we can't use it on the original in order to leave room for divergence. So the model only says that it is probable (probability 1) that the relative frequency will converge to the probability; any attempt to say that probability is relative frequency (Von Mises), or that relative frequency is probability (what I tried earlier), will fail one way or the other?

I guess the correct way to be very precise in using probability theory on repeatable events is to say that we assign a probability to an event if we view it as highly probable that the relative frequency will converge to this probability (number). And if a calculated probability for an event is p, the theory says that with probability 1 the relative frequency of this event will converge (independent trials etc.) to p.
And the reason we as humans make decisions based on probability theory is that we accept that our own perception of probability agrees with the axioms and the definition of independence, and hence with the mathematical description of probability? So if a calculated probability is 1, we view it as probable in our own perception of probability as well, because we as humans agree with the axioms?
 
  • #55
bobby2k said:
I guess the correct way to be very precise in using probability theory on repeatable events is to say that we assign a probability to an event if we view it as highly probable that the relative frequency will converge to this probability (number).


Yes, theorems in probability theory deal with the probability of things happening. When an actual event or an observed frequency is mentioned, the subject is the probability of such a thing.

And if a calculated probability for an event is p, the theory says that with probability 1 the relative frequency of this event will converge (independent trials etc.) to p.

The technical details of "will converge" involve statements about probability.
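
For reference, the statement behind "will converge" can be written out as follows. This is only a sketch in standard notation, not something taken from the posts above: the symbols N_n(A) (the number of occurrences of A in the first n trials) and ω (one infinite sequence of outcomes) are introduced here, and the trials are assumed to be independent with the same probability p for A.

```latex
P\left(\left\{\, \omega : \lim_{n\to\infty} \frac{N_n(A)(\omega)}{n} = p \,\right\}\right) = 1
```

The limit inside the braces is an ordinary calculus limit for one fixed sequence ω. The "almost surely" part is the outer statement that the set of such sequences has probability 1, so the conclusion is itself a statement about probability and not about what must happen.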

So if a calculated probability is 1, we view it as probable in our own perception of probability as well, because we as humans agree with the axioms?

I think most humans expect an event with a high probability to actually happen without consulting any axioms. It's a somewhat circular psychology. We don't accept theories that assign high probabilities to events that we don't expect to happen.

If you want to think about this subject in a coherent manner, you must understand and pay attention to the role of definitions. A definition of a concept must be expressed in terms of other concepts. Because of this, mathematics must begin with undefined concepts. (The alternative would be to get into circular definitions - C1 is defined using C2, C2 is defined using C3, C3 is defined using C1 or to get into infinite regressions - C1 is defined using C2, C2 is defined using C3, C3 is defined using C4,...etc.) The standard approach to probability is to define it in terms of a "measure" and if you trace this back to basic undefined concepts, you reach the same undefined concepts that are used to define the concepts of length, area, volume.

It's a natural human desire to seek a basis for probability theory that would employ concepts such as "the tendency for something to actually happen". This would face the formidable task of dealing with the concepts of "actuality" and "tendency". It would get into semantic tangles such as whether the "tendency for something to happen" is "actual" and whether there can be a "tendency of tendency" etc. If someone has attempted to axiomatize probability theory this way, the results aren't widely known.

In the formal mathematical development of a topic, the undefined concepts are not taken as "obvious" or "understood". You can't assert a property of an undefined concept based only on your intuitive interpretation of the concept. Any properties of the undefined concepts must be stated explicitly as assumptions. A person may feel confident that they can answer any questions that arise about "tendencies" and "actualities", but this competence would not constitute a mathematical theory. To have a mathematical theory, they must declare in advance what set of assumptions they were using to answer the questions.

I'm not saying that it would be impossible to axiomatize probability theory using a set of undefined concepts that are pleasing to our intuitive idea of "tendency" and "actuality". However, I think doing this would be very difficult. (Philosophical discussions of "the potential" and "the possible" vs "the actual" go back to Aristotle. Mathematical treatment would be a different matter.)
 
  • #56
Thanks, I have truly learned a lot from you in this thread. I can't believe that I once thought that the mathematical law of large numbers was some kind of guarantee for real world events.

I am sorry for asking a lot of questions. I think this will be the last one, and if you can confirm it, I think I will for now have an adequate understanding of the relationship between probability and observable real world relative frequencies. My question is whether the two statements below provide a correct way of thinking about real-world relative frequencies when using probability-theory. It would be nice if you could confirm it.

1. Let's say we happen to have a real world repeatable and independent experiment (let's assume we can), and we observe the event A in this experiment. If we choose to use probability-theory to analyze the experiment, the theory says that it is probable that the relative-frequencies will converge to the probability. So even though we have no way of knowing whether what we are observing is converging, and even if it were converging we wouldn't have control over the epsilon, we still approximate the probability with the relative frequency since the theory says it is probable that the relative-frequency would converge.

2. Conversely, suppose we for some reason have a repeatable experiment which contains A, and are given the probability of A, which we call p. We cannot say for sure what the relative frequency N(A)/N will be. But the theory says that it is probable that this relative frequency will converge to the probability. So if we observe the experiment and it seems that the relative frequencies converge to something other than the given p, we cannot say for sure that it was wrong to say that the probability of A was p. But when using probability theory we can say that it is probable that it was not p, because if it was p it is probable that the relative frequency would converge to p. (Again, here I have not taken into account that even if we had convergence we would not have control over the epsilon.)

If statistics-books had used this formulation instead of formulations like "the relative frequency will converge to the probability", would you then agree with the books?
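
To make the two statements concrete, here is a minimal simulation sketch. It is an illustration under stated assumptions, not a demonstration of convergence: the function name `running_relative_frequency`, the value p = 0.3, and the seeds are all made up for this example, and a pseudo-random generator stands in for the "real world repeatable and independent experiment". Any finite run only shows finitely many values of N(A)/N, so it cannot confirm convergence or tell us how close we are to p.

```python
import random


def running_relative_frequency(p, n_trials, seed=0):
    """Simulate n_trials independent trials in which event A occurs with
    probability p, and return the relative frequency N(A)/N after each trial."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    count_a = 0
    freqs = []
    for n in range(1, n_trials + 1):
        if rng.random() < p:  # event A occurred on this trial
            count_a += 1
        freqs.append(count_a / n)
    return freqs


if __name__ == "__main__":
    p = 0.3  # assumed "given" probability of A (hypothetical value)
    freqs = running_relative_frequency(p, 100_000, seed=1)
    # Print the relative frequency and its distance from p at a few sample sizes.
    for n in (10, 100, 1_000, 10_000, 100_000):
        print(n, freqs[n - 1], abs(freqs[n - 1] - p))
```

Changing the seed gives a different finite sequence of relative frequencies. Nothing in the code rules out a run that stays far from p; the theory only says such runs are improbable.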
 
  • #57
bobby2k said:
Thanks, I have truly learned a lot from you in this thread. I can't believe that I once thought that the mathematical law of large numbers was some kind of guarantee for real world events.

I am sorry for asking a lot of questions. I think this will be the last one, and if you can confirm it, I think I will for now have an adequate understanding of the relationship between probability and observable real world relative frequencies. My question is whether the two statements below provide a correct way of thinking about real-world relative frequencies when using probability-theory. It would be nice if you could confirm it.

1. Let's say we happen to have a real world repeatable and independent experiment (let's assume we can), and we observe the event A in this experiment. If we choose to use probability-theory to analyze the experiment, the theory says that it is probable that the relative-frequencies will converge to the probability. So even though we have no way of knowing whether what we are observing is converging, and even if it were converging we wouldn't have control over the epsilon, we still approximate the probability with the relative frequency since the theory says it is probable that the relative-frequency would converge.

2. Conversely, suppose we for some reason have a repeatable experiment which contains A, and are given the probability of A, which we call p. We cannot say for sure what the relative frequency N(A)/N will be. But the theory says that it is probable that this relative frequency will converge to the probability. So if we observe the experiment and it seems that the relative frequencies converge to something other than the given p, we cannot say for sure that it was wrong to say that the probability of A was p. But when using probability theory we can say that it is probable that it was not p, because if it was p it is probable that the relative frequency would converge to p. (Again, here I have not taken into account that even if we had convergence we would not have control over the epsilon.)

If statistics-books had used this formulation instead of formulations like "the relative frequency will converge to the probability", would you then agree with the books?


This is correct.
 
  • #58
bobby2k said:
If statistics-books had used this formulation instead of formulations like "the relative frequency will converge to the probability", would you then agree with the books?

I agree with the general idea of what you expressed. I'd prefer to see it written in a way that makes it clear that we act on belief. When you say "since the theory says it is probable that the relative-frequency would converge," you should make it clear that "since" is not used to mean that there is a mathematical deduction involved. (I.e., it isn't like saying x > 2 "since" x - 2 > 0.)
 
  • #59
Stephen Tashi said:
I agree with the general idea of what you expressed. I'd prefer to see it written in a way that makes it clear that we act on belief. When you say "since the theory says it is probable that the relative-frequency would converge," you should make it clear that "since" is not used to mean that there is a mathematical deduction involved. (I.e., it isn't like saying x > 2 "since" x - 2 > 0.)

Ah, very good catch. You are very good at distinguishing what is precisely the mathematics and what is not. Even though I have gotten better at this, I still tend to mix them up in this subject.

Would you say that I could express what I wanted there by changing the sentence, so I only dealt with mathematical terms of the theory? Or is it inevitable that we would use "belief" in explaining/justifying that we approximate a probability with a finite relative frequency?

Also, thank you very much Hornbein for taking the time to read what I wrote!
 
  • #60
There is a mathematically rigorous definition of "probability measure" that does not depend on any concepts of statistics or "frequency of occurrence". A probability measure is a non-negative function on a collection of subsets of a set (in general a sigma-algebra, which for finite or countable sets can be all subsets) that is countably additive for disjoint subsets and takes the value 1 on the entire set. All the general properties of the probability function can be derived from that basic definition. (For instance, see "A Course in Probability Theory" by Kai Lai Chung.) This is more for the pure mathematician than for an applied statistician.
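
Written out, the definition looks roughly like this. It is a sketch in the usual Kolmogorov notation; the symbols Ω (the whole set), F (the collection of subsets on which P is defined, a sigma-algebra in the standard formulation), and E_i are introduced here and do not come from the post.

```latex
\begin{aligned}
&\text{(1) Non-negativity:} && P(E) \ge 0 \quad \text{for every } E \in \mathcal{F}, \\
&\text{(2) Normalization:} && P(\Omega) = 1, \\
&\text{(3) Countable additivity:} && P\Big(\bigcup_{i=1}^{\infty} E_i\Big) = \sum_{i=1}^{\infty} P(E_i)
  \quad \text{for pairwise disjoint } E_1, E_2, \ldots \in \mathcal{F}.
\end{aligned}
```

Nothing in these three conditions mentions repeated trials or relative frequency; any connection to observed frequencies has to be added as a separate assumption or interpretation.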
 
  • #61
bobby2k said:
Or is it inevitable that we would use "belief" in explaining/justifying that we approximate a probability with a finite relative frequency?

If you have a mathematical theory that deals with one set of concepts ( e.g. weight, mass, position) and you try to apply it to a situation defined by a different set of concepts (e.g. price, value, utility) then you must introduce assumptions that establish some relation between the different concepts. You can introduce assumptions as formal mathematical axioms (which is necessary if you intend to prove your results) or you can introduce assumptions by your personal beliefs in an informal manner.

As far as I know, nobody has created a set of axioms for probability theory that establishes any deterministic relation between the relative frequency of an event and its probability. So if you want to establish such a relationship, you must do it using your personal beliefs - or else invent the mathematics that does the job.
 
  • #62
There is nothing mystical about probabilities. Given a set of data, it is obvious, and dirt simple, to ask which fraction or percentage satisfies certain conditions. Scaling those numbers to a [0,1] scale or to a [0%, 100%] scale is just computationally convenient. Also, using the past events (statistics of past experiences) to anticipate similar future events (probabilities of future outcomes) is critical to the survival of even the simplest thinking animals.
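
As a minimal sketch of that computation: the helper name `fraction_satisfying` and the list of die rolls below are hypothetical, made up for illustration. The function only reports the observed fraction in the data; treating that fraction as a probability for future outcomes is a further assumption.

```python
def fraction_satisfying(data, condition):
    """Return the fraction of items in data for which condition(item) is true."""
    if not data:
        raise ValueError("data must be non-empty")
    hits = sum(1 for item in data if condition(item))
    return hits / len(data)


rolls = [3, 6, 1, 6, 2, 5, 6, 4]  # hypothetical observed die rolls
frac = fraction_satisfying(rolls, lambda r: r == 6)
print(frac, f"{100 * frac:.1f}%")  # prints: 0.375 37.5%
```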
 
  • #63
Stephen Tashi said:
If you have a mathematical theory that deals with one set of concepts ( e.g. weight, mass, position) and you try to apply it to a situation defined by a different set of concepts (e.g. price, value, utility) then you must introduce assumptions that establish some relation between the different concepts. You can introduce assumptions as formal mathematical axioms (which is necessary if you intend to prove your results) or you can introduce assumptions by your personal beliefs in an informal manner.

As far as I know, nobody has created a set of axioms for probability theory that establishes any deterministic relation between the relative frequency of an event and its probability. So if you want to establish such a relationship, you must do it using your personal beliefs - or else invent the mathematics that does the job.

Thank you, I now have the understanding I wanted about this subject. Sorry for taking so much time to understand it, but thank you very much for being patient!
 
  • #64
"On voit, par cet Essai, que la théorie des probabilités n'est, au fond, que le bon sens réduit au calcul; elle fait apprécier avec exactitude ce que les esprits justes sentent par une sorte d'instinct, sans qu'ils puissent souvent s'en rendre compte."

"One sees, from this Essay, that the theory of probabilities is basically just common sense reduced to calculation; it makes one appreciate with exactness that which accurate minds feel with a sort of instinct, often without being able to account for it."

Pierre-Simon Laplace, from the Introduction to Théorie Analytique des Probabilités.
 