Estimating n for 90% Confidence Interval of Dice Roll Summing to 117

In summary: S80 is approximated by a CONTINUOUS random variable X having mean (7/2)*80 = 280 and variance (35/12)*80 = 233.3333. The probability that S80 = 117 is approximately equal to the probability that X is between 116.5 and 117.5. We can use the normal density function to compute this approximation, which is given by f(x|80) = (1/sqrt(2*pi*233.3333)) * e^(-(x-280)^2/(2*233.3333)). This is the function that we integrate from 116.5 to 117.5 to get the desired probability. In Maple, this
  • #1
Perrault

Homework Statement



(I translated this from french)

Some n dice are thrown and the sum of all their face values is 117.
Estimate n through a confidence interval of 90%.
In other words, of all the possible rolls of n dice that sum up to 117, find a range in which n should be located 90% of the time.
For example, [itex]20 \leq n \leq 117[/itex], 100% of the time because a 117-dice roll can sum up to 117, but not a 118-dice roll, and 19 dice can give a maximum sum of 114.

Homework Equations



To estimate the population average (here, the population is every possible roll of n dice that sums to 117) while the variance is unknown, and X is normally distributed (which it probably is), we use the following formula
[itex]\frac{\bar{X} - \mu}{\sqrt{\frac{S^{2}_{n-1}}{n}}} \sim T_{n-1}[/itex]

Where :

[itex]\bar{X}[/itex] is the sample average if that has anything to do with anything.
[itex]S^{2}_{n-1}[/itex] would be equal to [itex]\frac{n}{n-1} S^{2}[/itex] where [itex]S^{2}[/itex] would be the sample's variance and n the number of items in the sample.
[itex]T_{n-1}[/itex] is the symbol for Student's t-distribution.

There are five other, closely related, formulae, but this is the one that seems the most reasonable to use.

But I'm not even sure how that can be used to estimate n.


The Attempt at a Solution



What has been shown above probably shows at which point I am lost in this affair. We have to use some distribution, but I'm not even sure my choice is right.

Thanks!
 
  • #2


I don't think Student's t-distribution has anything to do with this problem. We can regard N (the number of dice) as a random variable with some prior distribution, say uniform over some large interval. We toss N dice and observe a total = 117. We could use a computer algebra system to work out the probabilities that the sum of n dice is 117, for each n between 20 and 117, but it is easier to work with a normal approximation.

Given n, we let f(x|n) = the normal density at x having mean m = (7/2) n and variance = (35/12)*n; these are the exact mean and variance of Sn = sum of n dice values. We can regard our observation as verifying {116.5 <= Sn <= 117.5}, and
[tex] P\{116.5 \le S_n \le 117.5 \} = \int_{116.5}^{117.5} f(x|n) \, dx \doteq f(117|n), [/tex]
to good approximation. So, given the observation, we can regard N as having a posterior distribution
[tex] P(n) = \frac{ f(117|n)}{\sum_{k=20}^{117} f(117|k) }.[/tex]
In this formula we use
[tex] f(117|n) = \frac{1}{\sqrt{2 \pi (35/12) n}} \exp\left(-\frac{1}{2} \frac{(117 - (7/2) n)^2}{(35/12)n} \right). [/tex]
You want to determine an interval [itex] I=\{n_1, n_1 + 1, \ldots, n_2 \}[/itex] such that [itex] \sum_{n \in I} P(n) \geq 0.90,[/itex] and presumably you would like I to be as short as possible, or nearly so.
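
As an illustration only, the recipe above can be sketched in Python. The function names (`normal_density`, `posterior`, `shortest_interval`) are made up for this sketch and are not from the thread. It assumes a uniform prior on n = 20..117 and takes "as short as possible" to mean the shortest run of consecutive n whose posterior mass reaches the level.

```python
import math

TOTAL = 117
N_MIN, N_MAX = 20, 117  # 19 dice reach at most 114; 118 dice give at least 118

def normal_density(x, n):
    """Normal density at x with mean (7/2)n and variance (35/12)n."""
    mean = 3.5 * n
    var = 35.0 / 12.0 * n
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def posterior(total=TOTAL, n_min=N_MIN, n_max=N_MAX):
    """Posterior P(n) under a uniform prior on n_min..n_max (an assumption)."""
    weights = {n: normal_density(total, n) for n in range(n_min, n_max + 1)}
    norm = sum(weights.values())
    return {n: w / norm for n, w in weights.items()}

def shortest_interval(post, level=0.90):
    """Shortest run of consecutive n whose posterior mass is >= level."""
    ns = sorted(post)
    best = None  # (length, -mass, n1, n2); ties go to the larger mass
    for i in range(len(ns)):
        mass = 0.0
        for j in range(i, len(ns)):
            mass += post[ns[j]]
            if mass >= level:
                cand = (j - i + 1, -mass, ns[i], ns[j])
                if best is None or cand < best:
                    best = cand
                break  # extending further only makes this interval longer
    _, neg_mass, n1, n2 = best
    return n1, n2, -neg_mass

post = posterior()
n1, n2, mass = shortest_interval(post)
print(f"mode n = {max(post, key=post.get)}")
print(f"shortest 90% interval: {n1}..{n2} (posterior mass {mass:.4f})")
```

The brute-force search over all start points is quadratic in the number of candidate n, which is negligible for 98 values.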

RGV
 
  • #3


Ray Vickson said:
[tex] P\{116.5 \le S_n \le 117.5 \} = \int_{116.5}^{117.5} f(x|n) \, dx \doteq f(117|n), [/tex]
This probability isn't too hard to calculate exactly with a scripting language such as perl or python. Simply use the generating polynomial [itex](x+x^2+x^3+x^4+x^5+x^6)/6[/itex]. The probability of getting a sum of 117 with N rolls is the coefficient of [itex]x^{117}[/itex] of the polynomial [itex](x+x^2+x^3+x^4+x^5+x^6)^N/6^N[/itex].
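
A minimal Python sketch of the generating-polynomial idea, for illustration. The helper names `ways_to_sum` and `prob_sum` are hypothetical. The polynomial is kept as a list of integer coefficients and truncated at the target degree, so the result is exact.

```python
from fractions import Fraction

def ways_to_sum(n_dice, total):
    """Coefficient of x^total in (x + x^2 + ... + x^6)^n_dice."""
    coeffs = [1] + [0] * total  # the polynomial 1, degrees 0..total
    for _ in range(n_dice):
        new = [0] * (total + 1)
        for degree, c in enumerate(coeffs):
            if c:
                for face in range(1, 7):
                    if degree + face <= total:  # drop terms above x^total
                        new[degree + face] += c
        coeffs = new
    return coeffs[total]

def prob_sum(n_dice, total):
    """Exact P{S_n = total} as a fraction."""
    return Fraction(ways_to_sum(n_dice, total), 6 ** n_dice)

for n in (20, 33, 117):
    print(n, float(prob_sum(n, 117)))
```

Integer arithmetic avoids rounding entirely; the conversion to float happens only for display.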
 
  • #4


D H said:
This probability isn't too hard to calculate exactly with a scripting language such as perl or python. Simply use the generating polynomial [itex](x+x^2+x^3+x^4+x^5+x^6)/6[/itex]. The probability of getting a sum of 117 with N rolls is the coefficient of [itex]x^{117}[/itex] of the polynomial [itex](x+x^2+x^3+x^4+x^5+x^6)^N/6^N[/itex].

Right. It is also straightforward in Maple, so one can find q(n) = P{Sn=117} for n from 20 to 117, and work with that array. However, the results are not very different from those obtained from the normal distribution. In particular, one gets the same confidence interval, but with very slightly different probabilities.

RGV
 
  • #5


Hello, and thanks for your replies,

I have trouble understanding your work. I don't understand from here on:
Ray Vickson said:
We can regard our observation as verifying {116.5 <= Sn <= 117.5}, and
[tex] P\{116.5 \le S_n \le 117.5 \} = \int_{116.5}^{117.5} f(x|n) \, dx \doteq f(117|n), [/tex]
to good approximation. So, given the observation, we can regard N as having a posterior distribution

If it's easier to use Maple, how would I do that?

Thanks again!
 
  • #6


Perrault said:
Hello, and thanks for your replies,

I have trouble understanding your work. I don't understand from here on:

If it's easier to use Maple, how would I do that?

Thanks again!

For a given large n, say n = 80, we want to compute the probability that S80 = 117, which is the probability that the sum of 80 dice values equals 117. In principle we could use a computer algebra system to evaluate the probability exactly, but it is almost as good, and much easier, to use a normal approximation. So, S80 is a DISCRETE random variable taking values in the set {80, 81, 82, ... ,480}, and with mean ES80 = 80*3.5 and variance VS80 = (35/12)*80. We want to use the normal distribution with this same mean and variance, but the normal describes a CONTINUOUS random variable X, with P{X = 117} = 0. How can we make a continuous distribution approximate a discrete one? Well, would you not agree that for the discrete random variable S80, the two events {S80 = 117} and {116.5 < S80 < 117.5} are exactly the same? (After all, S80 can only take integer values!) We cannot replace P{S80 = 117} by P{X = 117}, but we *can* replace P{116.5 < S80 < 117.5} by P{116.5 < X < 117.5}. In principle, we ought to use the exact interval probability (obtained by integrating the normal density f(x) from x = 116.5 to x = 117.5), but we can make a further approximation, and just replace the integral of f(x) over the interval by f(x) at the center of the interval (116.5,117.5), times the length (=1) of the interval; that is, the probability is approximately f(117)*1 = f(117). That is the form used in later calculations.
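
To see how close the midpoint value f(117) is to the interval probability, here is a small Python check, offered as a sketch. The helper names are made up, and n = 33 is added only as a second test case near 117/3.5; it is not a value taken from the post.

```python
import math

def normal_cdf(x, mean, var):
    # erfc keeps precision in the far tails, where 1 + erf(...) would round to 0
    return 0.5 * math.erfc(-(x - mean) / math.sqrt(2.0 * var))

def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def compare(n, total=117):
    """Return (P{total-0.5 < X < total+0.5}, f(total)) for n dice."""
    mean, var = 3.5 * n, 35.0 / 12.0 * n
    interval = normal_cdf(total + 0.5, mean, var) - normal_cdf(total - 0.5, mean, var)
    midpoint = normal_pdf(total, mean, var)
    return interval, midpoint

for n in (33, 80):
    interval, midpoint = compare(n)
    print(n, interval, midpoint, midpoint / interval)
```

The printed ratio shows how much is lost by using the density at the center instead of the integral; the gap is larger far out in the tail (n = 80) than near the center of the distribution (n = 33).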

Note: there is nothing mysterious here: that is what we *always* do when we approximate a discrete probability by a continuous one.

I just used the formulas in my first response and evaluated them all in Maple. I had also done an exact analysis (first, before using the approximation), obtained by getting the exact discrete probabilities P{Sn = 117} for n from 20 to 117. Basically, the Maple commands were:
> f :=1/6*(x+x^2+x^3+x^4+x^5+x^6): #the generating function for 1 die
> #
> # the generating function for n dice is f^n, and the coefficient of x^117 is P{Sn=117}
> #
> for n from 20 to 117 do
> q[n]:=evalf(coeff(expand(f^n),x,117)): end do: n:='n':
> Tot:=add(q[n],n=20..117):
> for n from 20 to 117 do
> P[n]:=q[n]/Tot: end do: n:='n':
Note: it is probably possible to speed this up considerably because we don't really need coefficients for x^k with k > 117, so at each n we can truncate at x^117, then just multiply by f and truncate again, etc. Also, one can keep expression swell down by using evalf at each stage, so the coefficients of x^k are all floats. However, the direct method was fast enough, so I did not bother.
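
The truncation idea in the note can be sketched in Python rather than Maple. This is an illustrative translation, not the session that was actually run, and `exact_posterior` is a made-up name. It keeps only the float coefficients up to x^117, multiplies by the one-die polynomial once per step, and normalizes at the end under the same uniform-prior assumption.

```python
TOTAL = 117

def exact_posterior(total=TOTAL):
    """Return (q, P): q[n] = P{S_n = total}, P[n] = q[n] / sum of all q."""
    n_min = -(-total // 6)          # fewest dice that can reach `total`
    n_max = total                   # every die shows at least 1
    coeffs = [1.0] + [0.0] * total  # coefficients of f^0 = 1, degrees 0..total
    q = {}
    for n in range(1, n_max + 1):
        new = [0.0] * (total + 1)
        for degree in range(total + 1):
            c = coeffs[degree]
            if c:
                for face in range(1, 7):
                    if degree + face > total:
                        break       # truncate: higher powers are never needed
                    new[degree + face] += c / 6.0
        coeffs = new                # now holds f^n truncated at x^total
        if n >= n_min:
            q[n] = coeffs[total]
    tot = sum(q.values())
    return q, {n: qn / tot for n, qn in q.items()}

q, P = exact_posterior()
best = max(P, key=P.get)
print("most probable n:", best, "with posterior", P[best])
```

Each step costs at most 6 * 118 multiply-adds, so the whole table for n up to 117 is computed in well under a second.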

RGV
 
  • #7


Perrault said:
I have trouble understanding your work, I don't understand from here on

What Ray did (and I would have approached this the same way) is to use Bayes' theorem,
[tex]P(A_i|E) = \frac{P(E|A_i)P(A_i)}{\sum_k P(E|A_k)P(A_k)}[/tex]
where
  • [itex]{A_i}[/itex] is a set of mutually exclusive events that collectively span the probability space. In this problem, the events are number of dice rolled N=1, N=2, ..., up to some rather large but finite number.
  • [itex]E[/itex] is some observed event, or evidence. In this problem, the event E is the given fact that the sum of the N dice rolls was 117.
  • [itex]P(A_i|E)[/itex] is the probability of event [itex]A_i[/itex] given the observed event [itex]E[/itex]. For example, what is the probability that the die was rolled 20 times to yield that total of 117? 21 times? These posterior probabilities are the desired quantities.
  • [itex]P(E|A_i)[/itex] is the probability of the observed event [itex]E[/itex] given the event [itex]A_i[/itex].
  • [itex]P(A_i)[/itex] is some estimate of the probability of event [itex]A_i[/itex] without that supporting evidence.
Without any prior supporting evidence, the principle of insufficient reason is about all one can go with: The priors are equiprobable. With this assumption of equal priors, Bayes' law reduces to
[tex]P(A_i|E) = \frac{P(E|A_i)}{\sum_k P(E|A_k)}[/tex]

To illustrate, suppose you were told that the sum of the dice was seven. There are six possible values for N here, N=2 through 7. The probability of rolling seven with two dice (P(S=7|N=2)) is 6/36. Continuing, with this,
P(S=7|N=3)=15/216
P(S=7|N=4)=20/1296
P(S=7|N=5)=15/7776
P(S=7|N=6)=6/46656
P(S=7|N=7)=1/279936

These probabilities of course don't sum to one. There's no reason to expect them to do so. They instead sum to 70993/279936. This is in effect the normalization factor that lets us scale the posterior probabilities so they sum to one. With this scaling,

P(N=2|S=7)=0.6571915540968828
P(N=3|S=7)=0.2738298142070345
P(N=4|S=7)=0.06085106982378544
P(N=5|S=7)=0.00760638372797318
P(N=6|S=7)=0.0005070922485315454
P(N=7|S=7)=1.408589579254293e-05

So N=2 and N=3 alone already give the 90% confidence interval in this case (P=93.1%). This obviously is not the answer you want, as there is no way to roll a sum of 117 with just 2 or 3 rolls of the dice.
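
The sum-of-seven illustration can be reproduced with exact fractions in a few lines of Python. This is a sketch with a hypothetical helper `ways`; it assumes equal priors on N = 2..7, as in the text above.

```python
from fractions import Fraction

def ways(n_dice, total):
    """Number of ordered rolls of n_dice dice that sum to total."""
    counts = {0: 1}
    for _ in range(n_dice):
        nxt = {}
        for s, c in counts.items():
            for face in range(1, 7):
                if s + face <= total:
                    nxt[s + face] = nxt.get(s + face, 0) + c
        counts = nxt
    return counts.get(total, 0)

TOTAL = 7
# Likelihoods P(S=7 | N=n) for every n that can produce a sum of 7
likelihood = {n: Fraction(ways(n, TOTAL), 6 ** n) for n in range(2, TOTAL + 1)}
norm = sum(likelihood.values())  # normalization factor for equal priors
posterior = {n: like / norm for n, like in likelihood.items()}

print("normalization:", norm)
for n, p in posterior.items():
    print(f"P(N={n}|S={TOTAL}) = {float(p)}")
```

Changing `TOTAL` to 117 and the range of n to 20..117 gives the posterior for the original problem by the same steps.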
 

