Statistics probability questions

In summary, each week there is a 55% chance that a visiting scholar from Switzerland arrives and burdens Stéphane with research questions, and a 45% chance that no scholar visits. If Stéphane fails to write 4 exercises one week, there is roughly an 88% chance he received a visiting scholar that week.
  • #1
Rifscape

Homework Statement



Each week, Stéphane needs to prepare 4 exercises for the following week's homework assignment. The number of problems he creates in a week follows a Poisson distribution with mean 6.9.

a. What is the probability that Stéphane manages to create enough exercises for the following week's homework? Round your answer to 4 decimal places.

b. Unfortunately, each week there is a 55% chance that a visiting scholar from Switzerland arrives and burdens Stéphane with research questions all week. During these weeks he only writes an average of 3.45 exercises. If Stéphane fails to write 4 exercises one week, what is the probability that he received a visiting scholar that week? Round your answer to 4 decimal places.

c. The last week of the semester, Stéphane decides to "reward" the students by no longer limiting himself to 4 exercises, and instead assigning every exercise he writes. If a student with a 60% chance of correctly answering an exercise is expected to answer 3 questions correctly, what is the probability that Stéphane did not have a visitor that week? Round your answer to 4 decimal places.
Hint: First find the number of exercises in the last week of the semester from the chance and expected value of the correct answers.

Homework Equations


P(A and B) = P(A) * P(B) (for independent events A and B)
P(A | B) = P(A and B)/P(B)
Poisson distribution equation:
P(x; μ) = (e-μ) (μx) / x!

The Attempt at a Solution



I was able to finish the first question and get the right answer but I'm having trouble on parts b and c.
For the first question:
P(x = 4) = 1 - P(x <= 3) = 1 - (P(0) + P(1) + P(2) + P(3))

Then I used the Poisson distribution equation and was able to get the answer.

I think for b you need to use the conditional probability equation, but I'm not sure what P(failing) would be.

I have no idea how to do c.

Any help is appreciated, thanks for reading.
 
  • #2
The formatting here is kind of hard to read... I think your Poisson distribution equation is wrong, though. If you are using parameter ##\mu## instead of the perhaps more typical ##\lambda##, the Poisson pmf is given by:

##P(x, \mu) = \frac{\mu^x e^{-\mu}}{x!}##

I'd focus first on part b. Draw a tree with a 55% chance of the situation with ##\mu = 3.45## and a 45% chance of ##\mu = 6.9##. Trees are visually powerful and quite useful in computational settings, so give it a shot and draw this tree. (Historical note: drawing a tree is also how Pascal solved the 'original' probability problem, the problem of points.) For each of the two leaves of this simple tree, what is the probability of not having done at least 4 exercises? I.e., what is ##P(x=0, \mu) + P(x=1, \mu) + P(x=2, \mu) + P(x=3, \mu)## for each leaf of this tree? From here you just need a normalizing constant so that your 'posterior' sums to one. Are you familiar with this?

Part C flips the conditioning around, à la Bayesian inference, but I'd focus on solving b first.
- - - -

btw for part A), your equation says "P(x = 4) " but it really should read ##P(x \geq 4)##
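
For reference, the two tail sums described above can be checked numerically. This is only a sketch: the helper names `poisson_pmf` and `prob_fewer_than` are made up for illustration, and it assumes part (a) uses the mean of 6.9 with no visitor involved.

```python
from math import exp, factorial

def poisson_pmf(x: int, mu: float) -> float:
    """P(X = x) for X ~ Poisson(mu)."""
    return mu**x * exp(-mu) / factorial(x)

def prob_fewer_than(k: int, mu: float) -> float:
    """P(X < k) = P(X <= k - 1) for X ~ Poisson(mu)."""
    return sum(poisson_pmf(x, mu) for x in range(k))

# Part (a): P(X >= 4) with mean 6.9 (assumed: no visitor in this part).
p_enough = 1.0 - prob_fewer_than(4, 6.9)

# Part (b), the two leaves of the tree: P(X <= 3) under each mean.
p_fail_scholar = prob_fewer_than(4, 3.45)
p_fail_no_scholar = prob_fewer_than(4, 6.9)

print(f"P(X >= 4 | mu = 6.9)  = {p_enough:.4f}")
print(f"P(X <= 3 | mu = 3.45) = {p_fail_scholar:.4f}")
print(f"P(X <= 3 | mu = 6.9)  = {p_fail_no_scholar:.4f}")
```

These leaf values still have to be weighted by 0.55 and 0.45 and normalized before they answer part (b).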
 
  • #3
Alright I did that, and was able to get the P(failing) since P(failing) = P(failing|scholar) + P(failing|no scholar)

Alright I was able to get .88 for the second question, though I don't know how to do the last part.
 
  • #4
Rifscape said:
Alright I did that, and was able to get the P(failing) since P(failing) = P(failing|scholar) + P(failing|no scholar)

Alright I was able to get .88 for the second question, though I don't know how to do the last part.
No, that is not correct; the correct result is
P(fail) = P(fail|scholar)*P(scholar) + P(fail|no scholar)*P(no scholar).
 
  • #5
Wait yeah you're right, I did that, just forgot to write it down. You can only get 0.88 if you do P(fail) = P(fail|scholar)*P(scholar) + P(fail|no scholar)*P(no scholar).

Do you know how to start part c?

Thanks for the help
 
  • #6
Shouldn't (c) be something like ##0.6(p*E[X_1] + (1-p)*E[X_2]) = 3##

via law of total expectation and linearity of expectations? All you are doing is solving for p, there. Again, drawing a picture first -- i.e. a tree-- should get you there.

I really can't emphasize this enough -- if at all possible, try drawing a picture to help solve your problems in probability. You should pretty much always be able to do this when you are working with a countable number of states...
 
  • #7
Rifscape said:
Wait yeah you're right, I did that, just forgot to write it down. You can only get 0.88 if you do P(fail) = P(fail|scholar)*P(scholar) + P(fail|no scholar)*P(no scholar).

Do you know how to start part c?

Thanks for the help

I don't get 0.88 for part (b).
 
  • #8
Yeah I drew a tree for part b and it made sense and I was able to get it, but I have been unable to do the same for this question. I attempted to draw one based on the formula you gave, but I don't think it's fully correct.

When I plugged in 6.9 for E[X1] and 3.45 for E[X2], I solved for p and got p= 0.4492, which is not correct. Though I think I am on the right track.
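
The arithmetic behind that value can be sketched as follows. The function name `solve_weight` is hypothetical, and the code assumes ##E[X_1] = 6.9## and ##E[X_2] = 3.45##, so that p is the weight on the no-visitor branch. It only solves the linear equation; it does not settle whether this is the answer the question is looking for.

```python
def solve_weight(target: float, q: float, e1: float, e2: float) -> float:
    """Solve q * (p * e1 + (1 - p) * e2) = target for p."""
    return (target / q - e2) / (e1 - e2)

# Assumed inputs: 3 expected correct answers, 0.6 chance per exercise,
# branch means 6.9 (no visitor) and 3.45 (visitor).
p = solve_weight(target=3.0, q=0.6, e1=6.9, e2=3.45)

print(f"p = {p:.4f}")
# Plug p back in to confirm the left-hand side equals 3.
print(f"check: {0.6 * (p * 6.9 + (1 - p) * 3.45):.4f}")
```

To six decimals the solution is 0.449275, so 0.4492 is the truncated value and 0.4493 the rounded one.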
 
  • #9
Ray Vickson said:
I don't get 0.88 for part (b).

I got that value after dividing P(scholar arriving and failing) by P(fail) = P(fail|scholar)*P(scholar) + P(fail|no scholar)*P(no scholar).
 
  • #10
Ray Vickson said:
I don't get 0.88 for part (b).

Rifscape said:
I got that value after dividing P(scholar arriving and failing) by P(fail) = P(fail|scholar)*P(scholar) + P(fail|no scholar)*P(no scholar).

Setting aside the nit that the answer wants 4 decimals, 0.88 checks out and is correct.
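
To get the 4-decimal value, the calculation can be written out as below. This is a minimal sketch using the numbers from the problem statement; `poisson_cdf` is an illustrative helper, not something from the thread.

```python
from math import exp, factorial

def poisson_cdf(k: int, mu: float) -> float:
    """P(X <= k) for X ~ Poisson(mu)."""
    return sum(mu**x * exp(-mu) / factorial(x) for x in range(k + 1))

P_SCHOLAR = 0.55                      # prior from the problem statement
like_scholar = poisson_cdf(3, 3.45)   # P(fail | scholar)
like_none = poisson_cdf(3, 6.9)       # P(fail | no scholar)

# Law of total probability for the denominator, then Bayes' rule.
p_fail = like_scholar * P_SCHOLAR + like_none * (1 - P_SCHOLAR)
posterior_scholar = like_scholar * P_SCHOLAR / p_fail

print(f"P(fail)           = {p_fail:.4f}")
print(f"P(scholar | fail) = {posterior_scholar:.4f}")
```

With these inputs the posterior comes out at about 0.8848, consistent with the 0.88 quoted above.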
 
  • #11
Alright cool that's good, just need to solve part c then. The p value I get from that equation doesn't seem to work.

Would X1=6.9 and X2=3.45? Or are the expected values different from the means of a poisson distribution
 
  • #12
Rifscape said:
Alright cool that's good, just need to solve part c then. The p value I get from that equation doesn't seem to work.

Would X1=6.9 and X2=3.45? Or are the expected values different from the means of a poisson distribution
To be clear ##X_1## and ##X_2## are random variables. But HIGH LEVEL (see bold at end) yes, you might interpret it as ##E[X_1] = \mu_1## and ##E[X_2] = \mu_2##. Unfortunately a lot of probability problems turn into linguistic ones where meanings easily get trampled on...

The issue I have with (c) is that the question has linguistic issues.

Your problem begins by saying

"Each week, Stéphane needs to prepare 4 exercises for the following week's homework assignment."

Then part c says:

The last week of the semester, Stéphane decides to "reward" the students by no longer limiting himself to 4 exercises

I have two interpretations here. One: the question is busted -- the idea of Stéphane preparing questions for the following week during the last week simply doesn't make any sense, as those questions would be issued after the semester ends. The alternative interpretation is, loosely speaking, that we have a renewal here: you have a Poisson with a cap at 4 from the prior week, and then another unbounded Poisson during the last week for issuing more exercises. So now you'd draw a tree with a root and two levels in it. That's my current thinking at least -- the wording has room for improvement.
 
  • #13
Yeah, my professor always has these wording issues. I think it's probably the second option, where there is a renewal and a new cap.
 
  • #14
Rifscape said:
I got that value after dividing P(scholar arriving and failing) by P(fail) = P(fail|scholar)*P(scholar) + P(fail|no scholar)*P(no scholar).

You are correct; I was calculating the wrong thing.
 
  • #15
Rifscape said:
Alright cool that's good, just need to solve part c then. The p value I get from that equation doesn't seem to work.

Would X1=6.9 and X2=3.45? Or are the expected values different from the means of a poisson distribution

The number of questions answered correctly by the student is neither ##EX_1= 6.9## nor ##EX_2 = 3.45##. These quantities are the numbers of questions on the quiz paper, (with or without a visiting scholar) but on average the student answers some of them incorrectly.
 
  • #16
Ray Vickson said:
The number of questions answered correctly by the student is neither ##EX_1= 6.9## nor ##EX_2 = 3.45##. These quantities are the numbers of questions on the quiz paper, (with or without a visiting scholar) but on average the student answers some of them incorrectly.

How would you get the E[X1] and E[X2] then? I'm still not really sure how I would tackle this question, I drew a tree, but there seems to be many parts to this question.
 
  • #17
Ray Vickson said:
The number of questions answered correctly by the student is neither ##EX_1= 6.9## nor ##EX_2 = 3.45##. These quantities are the numbers of questions on the quiz paper, (with or without a visiting scholar) but on average the student answers some of them incorrectly.

@ Ray Vickson
OP was responding to my equation from earlier where I said: ##0.6(p*E[X_1] + (1-p)*E[X_2]) = 3##
Rifscape said:
How would you get the E[X1] and E[X2] then? I'm still not really sure how I would tackle this question, I drew a tree, but there seems to be many parts to this question.

@ Rifscape
I would still recommend using my above setup. I don't understand why Ray Vickson would make such a comment.

If you want to formalize this and abstract even further (note: the following is unnecessary), you can use the law of iterated expectations. Consider some random variable ##Y## which denotes the number of questions answered correctly by the student. Consider ##N##, a natural-number-valued r.v. that gives the number of questions actually offered.

Consider ##Q##, a Bernoulli r.v. that has probability 0.6 of a correct answer by said student for a given question and is independent of ##N##.

The question states ##E[Y] = 3##. Using the law of iterated expectations we can rewrite this as

##E[Y] = 3##
##E[Y] = E[E[Y|N]] = 3##
##E[Y] = E[N E[Q]] = 3##
##E[Y] = E[N] E[Q] = 3##
##E[Y] = E[N] (0.6) = 3##
##E[Y] = 0.6 E[N] = 3##

When you consider that ##E[N] = (p*E[X_1] + (1-p)*E[X_2])##, you then get the equation I originally supplied. I am not supposed to supply the whole answer, so I leave finding ##E[X_1]## and ##E[X_2]## up to you, though maybe I can help if you have some follow-up questions where I won't have to give away the whole thing.
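
The step ##E[Y] = 0.6 E[N]## can also be sanity-checked numerically without giving the answer away. The sketch below assumes ##N## is Poisson and that, given ##N = n##, the number of correct answers is Binomial(n, 0.6). The function names and the truncation point `N_MAX` are illustrative choices, and the means used are arbitrary examples rather than the ones from the problem.

```python
from math import comb, exp, factorial

Q = 0.6      # chance the student answers any one exercise correctly
N_MAX = 60   # truncation point; the Poisson mass beyond it is negligible here

def poisson_pmf(n: int, mu: float) -> float:
    """P(N = n) for N ~ Poisson(mu)."""
    return mu**n * exp(-mu) / factorial(n)

def expected_correct(mu: float) -> float:
    """E[Y] summed directly over the joint distribution of (N, Y)."""
    total = 0.0
    for n in range(N_MAX + 1):
        # E[Y | N = n] with Y | N = n ~ Binomial(n, Q), summed term by term.
        e_y_given_n = sum(
            k * comb(n, k) * Q**k * (1 - Q) ** (n - k) for k in range(n + 1)
        )
        total += poisson_pmf(n, mu) * e_y_given_n
    return total

# Example means chosen only for illustration.
for mu in (2.0, 5.0):
    print(f"mu = {mu}: direct sum = {expected_correct(mu):.6f}, Q * mu = {Q * mu:.6f}")
```

The direct sum and the shortcut ##0.6 \mu## agree to within floating-point error, which is all the iterated-expectation argument claims.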
 
  • #18
I'm still kind of stuck on how to find E[X1] and E[X2], would I use the poisson distribution equation and find the probability that x >=3 using 6.9 for X1 and 3.45 for X2?
 
  • #19
Rifscape said:
I'm still kind of stuck on how to find E[X1] and E[X2], would I use the poisson distribution equation and find the probability that x >=3 using 6.9 for X1 and 3.45 for X2?

In the original problem statement it said "The number of problems he creates in a week follows a Poisson distribution with mean 6.9" before part (a) and said "During these weeks he only writes an average of 3.45 exercises" in part (b). What do you think those statements signify?
 
  • #20
Ray Vickson said:
In the original problem statement it said "The number of problems he creates in a week follows a Poisson distribution with mean 6.9" before part (a) and said "During these weeks he only writes an average of 3.45 exercises" in part (b). What do you think those statements signify?
Wouldn't those be the expected values, for x1 and x2, since it's a poisson distribution? But I already tried those values and they didn't work.
 
  • #21
Rifscape said:
Wouldn't those be the expected values, for x1 and x2, since it's a poisson distribution? But I already tried those values and they didn't work.

They should have worked. When I solved for ##p## I get a value different from your 0.4492.
 
  • #22
StoneTemplePython said:
@ Ray Vickson
OP was responding to my equation from earlier where I said: ##0.6(p*E[X_1] + (1-p)*E[X_2]) = 3##
@ Rifscape
I would still recommend using my above setup. I don't understand why Ray Vickson would make such a comment.

If you want to formalize this and abstract even further (note: the following is unnecessary), you can use the law of iterated expectations. Consider some random variable ##Y## which denotes the number of questions answered correctly by the student. Consider ##N##, a natural-number-valued r.v. that gives the number of questions actually offered.

Consider ##Q##, a Bernoulli r.v. that has probability 0.6 of a correct answer by said student for a given question and is independent of ##N##.

The question states ##E[Y] = 3##. Using the law of iterated expectations we can rewrite this as

##E[Y] = 3##
##E[Y] = E[E[Y|N]] = 3##
##E[Y] = E[N E[Q]] = 3##
##E[Y] = E[N] E[Q] = 3##
##E[Y] = E[N] (0.6) = 3##
##E[Y] = 0.6 E[N] = 3##

When you consider that ##E[N] = (p*E[X_1] + (1-p)*E[X_2])##, you then get the equation I originally supplied. I am not supposed to supply the whole answer, so I leave finding ##E[X_1]## and ##E[X_2]## up to you, though maybe I can help if you have some follow-up questions where I won't have to give away the whole thing.

I got x1 as 5 and x2 as 3, but when I plug them in I get?
 
  • #23
Ray Vickson said:
They should have worked. When I solved for ##p## I get a value different from your 0.4492.
Hmm really what value ? I keep getting 0.4492, is it the same equation that python gave?
 
  • #24
Ray Vickson said:
They should have worked. When I solved for ##p## I get a value different from your 0.4492.
Isn't it (3/0.6 - 3.45)/3.45, which equals 0.4492?
 
  • #25
Rifscape said:
Hmm really what value ? I keep getting 0.4492, is it the same equation that python gave?

Yes, but with the 0.6 factor included, and using the correct choices of ##EX_1## and ##EX_2##.
 
  • #26
Ray Vickson said:
Yes, but with the 0.6 factor included, and using the correct choices of ##EX_1## and ##EX_2##.
Huh, I'm using that too along with 6.9 and 3.45, but I can't get that answer
 
  • #27
Is the number you got 0.4539?
 
  • #28
Rifscape said:
Is the number you got 0.4539?

My p is (1-your p), so the answers agree when expressed in words.

Sorry for the confusion: I have been suffering from a cold that makes my head fuzzy, and so have been a bit mixed up.
 
  • #29
@Rifscape

To be honest, part (c) is probably one of the worst-worded questions that I've seen. I hope you have a higher quality text to study from or are reviewing something like MIT's 6.041 (https://ocw.mit.edu/courses/electri...s-analysis-and-applied-probability-fall-2010/ ) or Harvard's intro to probability (see Joe Blitzstein on YouTube). Otherwise there's a risk of learning the opposite of clear thinking from this course.

First criticism: see my earlier post, where I pointed out that the problem is illogical if questions are written during a given week for use in the following week and we are talking about Stéphane coming up with questions during the last week of the course. Based on results, it seems that the renewal idea is out the door, so my criticism that the question is busted is sustained. The fix would be to say that Stéphane makes the decision to 'reward' students at the beginning of the second-to-last week of the course (so that the questions can be issued during the last week).

Second, and new, criticism: the answer being sought violates the law of iterated expectations. Put differently, the question asks one thing but wants an answer for a different problem. Let me consider a simpler but probabilistically identical problem: instead of a student with a 60% chance of correctly answering an exercise, consider a top student with a 100% chance of answering an exercise correctly who is expected to answer 5 questions correctly.

What your professor wants is for you to take the prior distribution ##\left[\begin{matrix}0.55\\0.45\end{matrix}\right]## (visitor on top, no visitor below) and then use

##posterior \propto diag(likelihood) \left[\begin{matrix}0.55\\0.45\end{matrix}\right]##

and your prof wants you to assume that 5 questions were issued -- and use that for your likelihood function.

##posterior \propto
\left[\begin{matrix}p(5, \mu= 3.45) & 0\\0 & p(5, \mu= 6.9)\end{matrix}\right]
\left[\begin{matrix}0.55\\0.45\end{matrix}\right]
##

##posterior \propto
\left[\begin{matrix}
0.12929992
& 0\\0 &
0.13135067
\end{matrix}\right] \left[\begin{matrix}0.55\\0.45\end{matrix}\right]
##

If you normalize the above (i.e. make sure the resulting vector sums to one), you get

##posterior =
\left[\begin{matrix}0.546102377853564\\0.453897622146436\end{matrix}\right]
##
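
The normalization above can be reproduced with a few lines of code. This is a sketch under the same assumption as the matrix calculation, namely that exactly 5 questions were issued; the dictionary keys are just labels chosen for readability.

```python
from math import exp, factorial

def poisson_pmf(x: int, mu: float) -> float:
    """P(X = x) for X ~ Poisson(mu)."""
    return mu**x * exp(-mu) / factorial(x)

prior = {"scholar": 0.55, "no scholar": 0.45}
mean = {"scholar": 3.45, "no scholar": 6.9}
n_questions = 5  # assumed fixed: 3 expected correct / 0.6 per question

# Multiply each likelihood by its prior, then normalize so the entries sum to one.
unnormalized = {k: poisson_pmf(n_questions, mean[k]) * prior[k] for k in prior}
z = sum(unnormalized.values())
posterior = {k: v / z for k, v in unnormalized.items()}

for k, v in posterior.items():
    print(f"P({k} | N = 5) = {v:.4f}")
```

Rounded to 4 decimals, the no-visitor entry is 0.4539.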

- - - -
The issue is that this is not what the question says. If the above is supposed to be the correct answer, then it should say,

Stéphane decides to "reward" the students by no longer limiting himself to 4 exercises, and instead assigning every exercise he writes. If the assignment is given out and it has 5 questions on it, what is the probability that Stéphane did not have a visitor that week?

or, more cumbersomely, it could say:
Stéphane decides to "reward" the students by no longer limiting himself to 4 exercises, and instead assigning every exercise he writes. The assignment is given out, and it has a certain number of questions on it, which is not a random variable. If a student with a 100% chance of correctly answering an exercise is expected to answer 5 questions correctly, what is the probability that Stéphane did not have a visitor that week?

but what it actually says (using my top student setup) is:
Stéphane decides to "reward" the students by no longer limiting himself to 4 exercises, and instead assigning every exercise he writes. If a student with a 100% chance of correctly answering an exercise is expected to answer 5 questions correctly, what is the probability that Stéphane did not have a visitor that week?

In this setup, the random variable is actually the number of questions on the test, not the probability of having a visitor -- the probability of a visitor is just a parameter we are estimating. Despite what the hint suggests, you simply cannot make the jump to there being a deterministic 5 exercises. The question must specify that the student already received the x questions, or in some other manner make it explicit that the number of questions is now fixed. As it currently reads, there is no reason to believe that the student has seen the questions (or that Stéphane is done working on them). Instead, what we in effect learn is that a bookie tells us, based on all publicly known information, that a fair bet is that the top student will answer 5 questions correctly.

So the statement we actually get is far more general, and different, than being told that the top student has seen the homework assignment and that the top student expects to get 5 correct, and hence that there are only 5 questions being assigned (i.e. the number of questions being assigned is no longer a random variable).

The issue is that:

##posterior^T \left[\begin{matrix}3.45 \\ 6.9 \end{matrix}\right] =
\left[\begin{matrix}0.546102377853564\\0.453897622146436\end{matrix}\right]
^T \left[\begin{matrix}3.45 \\ 6.9 \end{matrix}\right] =
5.0159467964052062 \neq 5 ##

Hence the solution vector violates the law of iterated expectations, unless the student has seen the number of homework questions and determined it equals 5. The fundamental problem is that instead of telling you the number of questions assigned, the question writer (I believe, your prof) tried to get clever and give you an expected value statement that was not carefully worded. You may want to ask your prof about the law of iterated expectations, why the student's conditional expectation of questions being issued is not a random variable, and hence why said law of iterated expectations does not apply here. I would anticipate a fuzzy answer.
 
  • #30
StoneTemplePython said:
@Rifscape

To be honest, part (c) is probably one of the worst-worded questions that I've seen. I hope you have a higher quality text to study from or are reviewing something like MIT's 6.041 (https://ocw.mit.edu/courses/electri...s-analysis-and-applied-probability-fall-2010/ ) or Harvard's intro to probability (see Joe Blitzstein on YouTube). Otherwise there's a risk of learning the opposite of clear thinking from this course.

Let me consider a simpler but probabilistically identical problem: instead of a student with a 60% chance of correctly answering an exercise, consider a top student with a 100% chance of answering an exercise correctly who is expected to answer 5 questions correctly.

It makes perfectly good sense to ask for the posterior probability of a visiting scholar, given that a student answered 3 questions correctly and without making any assumption that the test has 5 questions. Admittedly the problem is more challenging than the "assume 5 questions" version, and may possibly be beyond the ability/knowledge of the OP, but I really do not know what the instructor intended, so having two possible versions cannot hurt. Below, let ##S## be the event "scholar week" and ##\bar{S}## the event "no-scholar week".

First of all, if 3 questions are answered correctly the test must contain ##N \geq 3## questions. For event ##\bar{S}## we have ##N \sim \text{Poisson}(\alpha)## with ##\alpha = 6.9,## while for event ##S## we have ##N \sim \text{Poisson}(\beta),## with ##\beta = 3.45##. Given an ##N = n \geq 3## the probability the student answers 3 questions correctly is the Binomial probability ##C(n,3) p^3 q^{n-3}, ## where ##p = 0.6, q = 0.4##. So, if ##Y## is the number of correctly-answered questions, then for event ##\bar{S}## we have ##P_{\bar{S}}(Y=3\: \&\: N=n) = C(n,3) p^3 q^{n-3} \alpha^n e^{-\alpha}/n!,## which simplifies to
$$P_{\bar{S}}(Y=3\: \&\: N=n) = \frac{p^3 \alpha^3}{3!} \frac{e^{-\alpha} (\alpha q)^{n-3}}{(n-3)!}.$$
Thus,
$$P(Y = 3|\bar{S}) = \sum_{n=3}^{\infty} P_{\bar{S}}(Y = 3\: \& \:N = n) = \frac{(\alpha p)^3}{3!} e^{-\alpha + \alpha q} = \frac{(\alpha p)^3 e^{-\alpha p}}{3!}.$$
Of course, this is Poisson probability with mean ##\alpha p = 0.6 \times 6.9 = 4.14.## Similarly, for event ##S##, ##Y## is Poisson with mean ##p \beta = 0.6 \times 3.45 = 2.07##. We have ##P(Y=3|\bar{S}) = 4.14^3 e^{-4.14}/3! \doteq 0.1883,## while ##P(Y=3|S) = 2.07^3 e^{-2.07}/3! \doteq 0.1865.##

The posterior probability of no visiting scholar, given that ##\{ Y = 3 \}##, is
$$P(\bar{S}|Y=3) = \frac{P(Y=3|\bar{S}) P(\bar{S})}{P(Y=3)} = \frac{(0.45)(0.1883)}{(0.45)(0.1883) + (0.55)(0.1865)}.$$
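
A numerical sketch of this version follows, under the interpretation that the student actually answered exactly 3 questions correctly. It checks the thinning identity by brute-force summation (truncated at a hypothetical `n_max`) and then evaluates the posterior; the variable names are illustrative.

```python
from math import comb, exp, factorial

P, ALPHA, BETA = 0.6, 6.9, 3.45  # success chance, no-scholar mean, scholar mean

def poisson_pmf(x: int, mu: float) -> float:
    """P(X = x) for X ~ Poisson(mu)."""
    return mu**x * exp(-mu) / factorial(x)

def p_y3_by_summation(mu: float, n_max: int = 80) -> float:
    """P(Y = 3) as a sum of Binomial(n, P) * Poisson(mu) terms, n = 3..n_max."""
    return sum(
        comb(n, 3) * P**3 * (1 - P) ** (n - 3) * poisson_pmf(n, mu)
        for n in range(3, n_max + 1)
    )

# Closed form from the derivation: Y is Poisson with mean P * mu.
like_none = poisson_pmf(3, ALPHA * P)     # P(Y = 3 | no scholar)
like_scholar = poisson_pmf(3, BETA * P)   # P(Y = 3 | scholar)

posterior_none = 0.45 * like_none / (0.45 * like_none + 0.55 * like_scholar)

print(f"closed form vs sum: {like_none:.6f} vs {p_y3_by_summation(ALPHA):.6f}")
print(f"P(no scholar | Y = 3) = {posterior_none:.4f}")
```

Under this reading the posterior probability of no visitor is about 0.4523, close to but not the same as the value from the "assume 5 questions" version.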
 
  • #31
Ray Vickson said:
It makes perfectly good sense to ask for the posterior probability of a visiting scholar, given that a student answered 3 questions correctly and without making any assumption that the test has 5 questions. Admittedly the problem is more challenging than the "assume 5 questions" version, and may possibly be beyond the ability/knowledge of the OP, but I really do not know what the instructor intended, so having two possible versions cannot hurt...

First of all, if 3 questions are answered correctly the test must contain ##N \geq 3## questions.

I think your response is fair and you are definitely right that the wording is open for multiple interpretations. In my book, having an observation is a very big deal and the wording is silent whether or not the student has even received the assignment... all we have to go on is, that the student "is expected to answer 3 questions correctly." It seems to me that clarity is important generally in math and especially so in probability, but this question is not clear.

That said, maybe OP's prof meant to write a good homework question, but was interrupted by a visitor from Switzerland.
 

FAQ: Statistics probability questions

1. What is the difference between statistics and probability?

Statistics is the study of collecting, organizing, analyzing, and interpreting data. Probability, on the other hand, is the measure of how likely an event is to occur based on the possible outcomes. In other words, statistics deals with analyzing data that has already been collected, while probability deals with predicting the likelihood of future events.

2. How do you calculate probability?

When all outcomes are equally likely, you calculate probability by dividing the number of favorable outcomes by the total number of possible outcomes. For example, if you roll a fair six-sided die, the probability of rolling a 3 is 1/6, since there is one favorable outcome out of six equally likely outcomes.

3. What is the difference between discrete and continuous probability distributions?

Discrete probability distributions deal with outcomes that are countable: the possible values can be listed, whether there are finitely many (the number of heads in ten coin flips) or countably infinitely many (a Poisson count). Continuous probability distributions, on the other hand, deal with outcomes that can take any value in an interval, such as height, weight, or time.

4. How do you interpret a confidence interval?

A confidence interval is a range of values computed from a sample by a procedure that captures the true population parameter with a stated long-run frequency. For example, for a 95% confidence interval for the mean height of a population, if we were to take many samples and compute an interval from each, about 95% of those intervals would contain the true mean height.

5. What is the difference between correlation and causation?

Correlation is a statistical measure that shows the relationship between two variables. It does not imply causation, meaning that just because two variables are correlated does not mean that one causes the other. Causation refers to a direct cause and effect relationship between two variables. In order to establish causation, further research and experiments need to be conducted.
