- #36
StoneTemplePython
Science Advisor
Gold Member
- 1,265
- 597
Jonathan212 said:Look at some more numbers. All at 60% heads or more:
Probability of 15 or more heads in 25 throws = 1 in 4.7
Probability of 60 or more heads in 100 throws = 1 in 35
Probability of 150 or more heads in 250 throws = 1 in 1,062
Probability of 240 or more heads in 400 throws = 1 in 27,000
Probability of 300 or more heads in 500 throws = 1 in 220,000
Probability of 360 or more heads in 600 throws = 1 in 1,800,000
Probability of 480 or more heads in 800 throws = 1 in 1,200,000,000
In some fundamental sense, the problem is that you are looking at this the wrong way -- you are looking at this in terms of some kind of constant percentage -- i.e. 60% of n.
A much improved way to look at this is consider your threshold (e.g. at least 15 heads ) vs the 'distance' from that to the mean. The distance involved is called the amount of standard deviations from the mean. If you re-visit these numbers, you'll see that you're progressively increasing the distance between the threshold and the mean as you go down this table. Look at a picture of the bell curve and you can see how it rapidly drops off the farther from the mean you go... the result here should not surprise you if you think about it this way (and know/assume that a normal approximation is reasonable).
So, here's quick look:
## \left[\begin{matrix}
\text{n} & \geq\text{threshold} & \text{raw excess over mean} & \text{ std deviations from the mean} &\text{normal approximation} \\
25 & 15 & 2.5 & 1 & 0.24 \\
100 & 60 & 10 & 2 & 0.027 \\
250 & 150 & 25 & 3.16 & 0.00085 \\
\vdots& \vdots& \vdots & \vdots & \vdots \\
500 & 300 & 50 & 4.47 & 4.0 \cdot 10^{-6} \\
1000 & 600 & 100 & 6.32 & 1.3 \cdot 10^{-10} \\\end{matrix}\right]##
for a binomial, you have
##\text{mean}= \mu = n \cdot p## and
##\text{variance}= n \cdot p(1-p) \to \text{standard deviation}= \sqrt{n \cdot p(1-p)}##
where the normal approximation used was:
##P(X\geq c) = P(X\gt c) \sim \frac{1}{c\sqrt{2 \pi}} e^{-\frac{c^2}{2}}##
(reference post #26 for more information)
and ##c## is precisely the number of standard deviations from the mean that your threshold is.
(Note: we can fine tune this with the 1/2 / continuity correction as is common for approximating binomials with standard normals, among other techniques though I skipped it them for convenience. A more meticulous approach would also bound the error made when approximating a binomial by a normal. )
Last edited: