Probabilities and random variables

  • #1
Mohamed BOUCHAKOUR

Homework Statement


In a given society, 15% of people have the sickness "Sa"; of those, 20% also have the sickness "Sb".
Of those who do not have "Sa", 5% have "Sb".
1- We randomly choose a person and define:
A: "the person has Sa"
B: "the person has Sb"

- Calculate P(A), P(B) and P(A∩B).

2- We take 10 people from this society. We define X as the random variable equal to the number of them who have both Sa and Sb.
- Give the possible values of X and the probability of each.

The first one is pretty easy; I need some help with the second.

Homework Equations


Results of the first question:
P(A∩B)=3%
P(A)=15%
P(B)=725/10000
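
For reference, these values follow from the multiplication rule and the law of total probability, reading the 20% and 5% in the statement as the conditional probabilities ##P(B \mid A)## and ##P(B \mid A^c)## (the symbol ##A^c## for "does not have Sa" is notation introduced here, not part of the exercise):

$$P(A \cap B) = P(A)\,P(B \mid A) = 0.15 \times 0.20 = 0.03$$
$$P(B) = P(A)\,P(B \mid A) + P(A^c)\,P(B \mid A^c) = 0.03 + 0.85 \times 0.05 = 0.0725$$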

The Attempt at a Solution


X can take any integer value between 0 and 10.
I tried two things:
1- Using the probability tree:
- ##P(X=0) = (1-P(A \cap B))^{10} \approx 0.737##
- ##P(X=1) = 10 \, P(A \cap B) \, (1-P(A \cap B))^{9} \approx 0.228##
But then it gets a bit too complicated to work out how many times each branch is repeated.

2- Suppose this society consists of 1000 people, so 30 of them have both Sa and Sb:
- ##P(X=0) = C^{970}_{10} / C^{1000}_{10} \approx 0.736##
- ##P(X=1) = C^{30}_1 C^{970}_9 / C^{1000}_{10} \approx 0.230##
...

I think both are correct, but the first is way too complicated for the larger values of X.
Also, is there another method that avoids assuming the number of people in the society?
 
  • #2
No, they are not both correct. In principle, the second way is likely more accurate, but the details depend on the exact size of the whole population. (Furthermore, if anything, it is more complicated to calculate than the first way.) However, we are saved by the fact that for large populations both ways give almost identical results. In other words, for large populations, the hypergeometric distribution (the second way) becomes essentially indistinguishable from the much simpler binomial distribution (the first way).
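
A minimal sketch of that convergence, assuming a few hypothetical population sizes (the exercise does not give one) and assuming exactly 3% of each population has both sicknesses. The function names are illustrative only.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) when each of n independent draws succeeds with probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def hypergeometric_pmf(k, n, population, marked):
    """P(X = k) when n people are drawn without replacement from a
    population in which `marked` people have both sicknesses."""
    return comb(marked, k) * comb(population - marked, n - k) / comb(population, n)

n, p = 10, 0.03
# Assumed population sizes, chosen only to show the trend.
for population in (1_000, 10_000, 1_000_000):
    marked = round(p * population)
    for k in (0, 1, 2):
        h = hypergeometric_pmf(k, n, population, marked)
        b = binomial_pmf(k, n, p)
        print(f"population={population:>9}  k={k}  "
              f"hypergeometric={h:.5f}  binomial={b:.5f}")
```

The gap between the two columns shrinks as the population grows, which is why the binomial model is a reasonable choice when the population size is unknown but large.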

For the binomial case you can easily use a calculator (or a spreadsheet, or an on-line calculator) to give a complete table of probability values P(N=n) for n = 0,1,2, ... , 10. Furthermore, you can do it recursively: from P(N=0) you can do some simple multiplications to get P(N=1). From P(N=1) it is a simple matter of some multiplications to get to P(N=2), etc. It really is NOT that complicated, especially if you think it through first.
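
One way to carry out that recursion, as a sketch: it assumes the binomial model with 10 people and success probability 0.03, and uses the ratio P(N=k+1)/P(N=k) = ((n-k)/(k+1)) · (p/q), which follows from the ratio of consecutive binomial coefficients. The function name `binomial_table` is hypothetical.

```python
def binomial_table(n, p):
    """Build [P(N=0), ..., P(N=n)] recursively, starting from P(N=0) = q**n."""
    q = 1 - p
    probs = [q**n]
    for k in range(n):
        # Step from P(N=k) to P(N=k+1) with two multiplications.
        probs.append(probs[-1] * (n - k) / (k + 1) * p / q)
    return probs

table = binomial_table(10, 0.03)
for k, prob in enumerate(table):
    print(f"P(N={k:2d}) = {prob:.6f}")
```

The printed values should sum to 1 up to rounding, which is a quick check that no step was dropped.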
 
  • #3
I agree with Ray, and:

I would expect that giving the full expression for the binomial distribution constitutes a valid answer for this exercise?
 
  • #4
BvU said:
I agree with Ray, and:

I would expect that giving the full expression for the binomial distribution constitutes a valid answer for this exercise?

Probably not. Since we didn't go over the Binomial Distribution Formula in class, it would be better to elaborate more.
 
  • #5
Ray Vickson said:
No, they are not both correct. In principle, the second way is likely more accurate, but the details depend on the exact size of the whole population. (Furthermore, if anything, it is more complicated to calculate than the first way.) However, we are saved by the fact that for large populations both ways give almost identical results. In other words, for large populations, the hypergeometric distribution (the second way) becomes essentially indistinguishable from the much simpler binomial distribution (the first way).
I didn't think about it this way. I'm just starting with probabilities, so I didn't know most of this, but you led me to do some research on Google, thanks for that :wink:

Ray Vickson said:
For the binomial case you can easily use a calculator (or a spreadsheet, or an on-line calculator) to give a complete table of probability values P(N=n) for n = 0,1,2, ... , 10. Furthermore, you can do it recursively: from P(N=0) you can do some simple multiplications to get P(N=1). From P(N=1) it is a simple matter of some multiplications to get to P(N=2), etc. It really is NOT that complicated, especially if you think it through first.

Since I didn't know the Binomial Distribution Formula (we didn't go over it in class), the only thing I had was the probability tree.
And thanks to my stupidity, I didn't think of a way to find how many ways there are to choose in each case (except guessing); this is why I said "complicated".
 
  • #6
Mohamed BOUCHAKOUR said:
I didn't think about it this way. I'm just starting with probabilities, so I didn't know most of this, but you led me to do some research on Google, thanks for that :wink:
Since I didn't know the Binomial Distribution Formula (we didn't go over it in class), the only thing I had was the probability tree.
And thanks to my stupidity, I didn't think of a way to find how many ways there are to choose in each case (except guessing); this is why I said "complicated".

You were starting out writing down the first two probabilities, but then gave up. You say it became "too complicated", but then went on to write formulas like ##C^{30}_1 C^{970}_9 /C^{1000}_{10},## etc., and that is way more complicated than what you would get in the first way. You obviously know about binomial coefficients ##C^n_m,## so you know about the needed tools.

Anyway, you should get in the habit of shortening what you write by using sensible notation. For example, you could say "let ##p = P(A \cap B) = 0.03## and ##q = 1-p = 0.97##". Then ##P(N=0) = q^{10}, P(N=1) = 10\, p\, q^9,## etc. Writing ##P(A \cap B)## over and over again really is a waste of time, and is also much harder to read.
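
As a sketch of where this notation leads (the index ##k## is introduced here for illustration, and the 10 people are assumed to be independent of one another): each particular arrangement in which exactly ##k## of the 10 people have both sicknesses has probability ##p^k q^{10-k}##, and there are ##C^{10}_k## ways to choose which ##k## people those are. So

$$P(N=k) = C^{10}_k \, p^k q^{10-k}, \qquad k = 0, 1, \dots, 10.$$

For example, ##P(N=2) = C^{10}_2 \, p^2 q^8 = 45 \times 0.03^2 \times 0.97^8 \approx 0.0317##.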
 

FAQ: Probabilities and random variables

What is the difference between probability and random variables?

Probability refers to the likelihood of a certain event or outcome occurring. It is represented by a number between 0 and 1, where 0 indicates impossibility and 1 indicates certainty. A random variable, on the other hand, is a numerical quantity whose value is determined by the outcome of a random experiment. Its distribution gives the probability of each value it can take.

What is the role of probability in statistics?

Probability is a fundamental concept in statistics, as it helps us understand the likelihood of different outcomes and make predictions based on data. It allows us to quantify uncertainty and make informed decisions based on the likelihood of certain events occurring.

How are probabilities and random variables used in real-world applications?

Probabilities and random variables are used in a wide range of fields such as finance, economics, engineering, and healthcare. For example, in finance, probabilities are used to calculate the risk associated with different investments and to make decisions on portfolio management. In healthcare, probabilities can be used to predict the likelihood of a certain disease occurring in a population. Random variables are used to model and analyze real-world phenomena, such as stock prices, weather patterns, and disease outbreaks.

What is the difference between discrete and continuous random variables?

Discrete random variables can only take on a finite or countably infinite number of values, while continuous random variables can take on any value within a certain range. For example, the number of heads obtained when flipping a coin is a discrete random variable, while the height of individuals in a population is a continuous random variable.

How do you calculate the expected value of a random variable?

The expected value of a discrete random variable is the sum of each possible value multiplied by its associated probability. It represents the average value we would expect to obtain if the underlying experiment were repeated many times. It is denoted by E[X] and is calculated using the formula ##E[X] = \sum_x x \, P(X=x)##, where x ranges over the possible values and P(X=x) is the probability of that value occurring.
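
As an illustration, the sum can be evaluated for the X from the thread above, assuming the binomial model with 10 people and a 0.03 probability per person. The variable names are chosen here for the example.

```python
from math import comb

n, p = 10, 0.03   # assumed binomial model from the thread above

# Probability of each possible value x = 0, 1, ..., n.
pmf = {x: comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)}

# E[X] as the probability-weighted sum of the possible values.
expected = sum(x * prob for x, prob in pmf.items())
print(f"E[X] = {expected:.4f}")
```

For a binomial variable the sum agrees with the shortcut E[X] = n·p, here 10 × 0.03 = 0.3, so on average fewer than one person in a group of 10 has both sicknesses.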
