Mean time between lottery wins and probability of fraud by organizers

In summary, it seems that a lottery where you pick 5 numbers out of the set (1,2, ..., 50) is being rigged in a way where people are winning too often. It is not clear how to investigate this, but it is probably a fraud.
  • #71
Jonathan212 said:
Can't I just ignore that information and instead give the fact that the binomial distribution in

= 1 - BINOMDIST( M - 1 , N , 0.5, 1 )

is approximated by the normal distribution in

= 1 - NORMDIST( M - 1, N * 0.5, SQRT( N * 0.5 * (1-0.5) ), 1 )

where we'd replace 0.5 by 1/24,435,180 and use N = 10,457,692,468 and M = 465 ?
There it is (bold added by me).
Jonathan212 said:
EDIT: just found the error. You're looking at the "|z| >" value but you should be looking at "z >".
Why? Wouldn't a deviation in the other direction be equally suspicious?
Jonathan212 said:
And because we want 465 or more, i.e. > 464, you should have calculated how many standard deviations 464 is from 428, not 465 from 428.
Within the Poisson or normal approximation this doesn't matter. Using 464.5 (a continuity correction) should be slightly better.

WolframAlpha can calculate some extreme values. Check individual parts - you'll see the approximation is a *really* good one here.
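For anyone who wants to check the individual parts without a spreadsheet, here is a minimal Python sketch using only the standard library. It evaluates the normal approximation from the quoted formula at 464 and at 464.5, and compares both with a Poisson tail for the same mean. The function names are my own, and the Poisson sum is used as a stand-in for the exact binomial on the assumption that the two are practically indistinguishable when the per-ticket probability is this small.

```python
import math

# Figures quoted in the thread
N = 10_457_692_468        # tickets sold
P_WIN = 1 / 24_435_180    # chance that a single ticket wins
M = 465                   # observed number of winners


def normal_upper_tail(x, mean, sd):
    """P(X > x) for a normal distribution with the given mean and sd."""
    return 0.5 * math.erfc((x - mean) / (sd * math.sqrt(2)))


def poisson_upper_tail(m, lam):
    """P(X >= m) for X ~ Poisson(lam); each pmf term is built in log space."""
    cdf = math.fsum(
        math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))
        for k in range(m)
    )
    return 1.0 - cdf


mean = N * P_WIN
sd = math.sqrt(N * P_WIN * (1 - P_WIN))

p_normal = normal_upper_tail(M - 1, mean, sd)        # same inputs as the NORMDIST formula
p_normal_cc = normal_upper_tail(M - 0.5, mean, sd)   # continuity-corrected variant
p_poisson = poisson_upper_tail(M, mean)              # Poisson stand-in for the binomial

print(f"expected winners = {mean:.2f}, sd = {sd:.2f}")
print(f"normal, x = 464   : {p_normal:.4f}")
print(f"normal, x = 464.5 : {p_normal_cc:.4f}")
print(f"Poisson, X >= 465 : {p_poisson:.4f}")
```

All three numbers should land close to each other, which is the point: the choice between 464 and 464.5 changes the result far less than the other uncertainties in the problem.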
 
  • #72
Why did you add the bold? To say it is incorrect? This is the formula we derived in the other thread for an identical problem with different N, M and probability. EDIT: it matches WolframAlpha perfectly too, if you type it in Excel.

A deviation in the opposite direction, i.e. too few winning tickets, would not line the pockets of the organizers as easily, because there are accountants auditing where the money goes when there is no win: it goes to the next draw.
 
  • #73
Another question is how many digits of this p = 0.040816379 result we should trust. Should the statistical significance be shown as "p < 0.05"?
 
  • #74
Jonathan212 said:
Why did you add the bold? To say it is incorrect?
It is not incorrect. Check how you started the post (it is in the quote). You asked "can I ignore that, and just use [...]", but this "[...]" included the information you asked about.
Jonathan212 said:
A deviation in the opposite direction, i.e. too few winning tickets, would not line the pockets of the organizers as easily, because there are accountants auditing where the money goes when there is no win: it goes to the next draw.
A larger jackpot tends to attract more players, which means a larger profit for the organizers.
Jonathan212 said:
Another question is how many digits of this p = 0.040816379 result we should trust.
Certainly don't use more than two significant figures. p=0.041 looks good, p=0.04 is not bad either. It is not small enough to claim fraud, especially as we know there are factors that make us underestimate the p-value.
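One way to see why the extra digits carry no information is to perturb the inputs slightly. The sketch below is only an illustration: the 1% uncertainty on the number of tickets is a made-up figure, not something established in this thread, and `p_value` is a hypothetical helper that repeats the normal approximation quoted earlier.

```python
import math

N = 10_457_692_468        # tickets sold, as quoted in the thread
P_WIN = 1 / 24_435_180    # chance that a single ticket wins
M = 465                   # observed number of winners


def p_value(n_tickets, winners=M, p_win=P_WIN):
    """Normal approximation to P(at least `winners` wins among n_tickets)."""
    mean = n_tickets * p_win
    sd = math.sqrt(mean * (1 - p_win))
    return 0.5 * math.erfc((winners - 1 - mean) / (sd * math.sqrt(2)))


# Hypothetical +/-1% error on the ticket count (an assumption for illustration)
for shift in (-0.01, 0.0, 0.01):
    print(f"N shifted by {shift:+.0%}: p = {p_value(N * (1 + shift)):.3f}")
```

Under that assumed 1% error the first significant figure of p already changes, so quoting nine digits would be misleading.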
 
  • #75
Does a question like "what is the probability that the organizers have never cheated by adding a winner after a draw?" make sense mathematically?
 
  • #76
Lottery wins cannot be analyzed on the assumption that the game is fair.

Wins are not allowed to happen purely at random, because the innumerate general public would misinterpret the result as fraud. The lottery commissions use secret internal algorithms to make the distribution of winning locations and dates match what the general public assumes randomness looks like. They suppress variance and fluctuations to get a more balanced spread of winning locations and times, avoiding unfair-looking distributions in which some locations win too often or too rarely.

Methodically thinking about the optimum algorithm and process by which the lottery commission might ensure this controlled pseudorandom distribution of wins, Joan Ginther*, former math professor with a PhD from Stanford University specializing in statistics, won four Texas lotteries (total over $20 million).

* "The Luckiest Woman on Earth", Harper's Magazine, August 2011
 