Birthday problem (probability)

  • Thread starter r0bHadz
  • Start date
  • Tags
    Probability
In summary, we can use the complementary value to solve the equation for r without plug and chugging. By working in logspace, we can avoid any potential overflow and easily compute the answer by hand for the numerator. We can then exponentiate to find the final answer.
  • #1
r0bHadz
194
17

Homework Statement


a)
Consider a class with 30 students. Compute the probability that at least two of them
have their birthdays on the same day. (For simplicity, ignore the leap year).

b)
How many students should be in class in order to have this probability above 0.5?

Homework Equations

The Attempt at a Solution


the answer to a is

P(at least two have birthday on same day) = 1 - P(none have bday on same day)

= 1 - ( (365)(364)...(336) / (365)^30 ) = about .71

Honestly, just computing the first answer was tedious enough. I had to multiply everything out and it took forever. So I decided to write a python script to make it faster.

Now for the second one it is

solving the equation 1- P(365,r)/365^30 > .5 => (365-r)! > (364!)/( (.5)(365)^29 )

but there must be some way to solve this without plug and chugging but my IQ is limiting me from arriving to how. Of course I can just plug numbers in but that doesn't seem practical. And my school doesn't allow graphing calculators, but even if it did I doubt the calc wouldn't overflow with numbers so large
 
Physics news on Phys.org
  • #2
r0bHadz said:
but there must be some way to solve this without plug and chugging but my IQ is limiting me from arriving to how. Of course I can just plug numbers in but that doesn't seem practical. And my school doesn't allow graphing calculators, but even if it did I doubt the calc wouldn't overflow with numbers so large

I'm not sure what your question is...

Let's start with an easier problem. If there are 30 students in a class, what's the expected number of pairs of people having the same birthday? (These are pairwise comparisons... so if Bob, Alice and RobHadz all have a birthday on Jan 2nd, then this is 3 'collisions')

Can you answer this?
 
  • #3
StoneTemplePython said:
I'm not sure what your question is...

Let's start with an easier problem. If there are 30 students in a class, what's the expected number of pairs of people having the same birthday? (These are pairwise comparisons... so if Bob, Alice and RobHadz all have a birthday on Jan 2nd, then this is 3 'collisions')

Can you answer this?

The question would be: how can I find a way to solve for r without just plug and chugging? Is there some algebraic manipulation or something that I can do?

And I don't think I can answer that yet. The next chapter in my book is on expectation and variance. Without this knowledge, and just based on the day to day definition of "expectancy" I would say its well below .5 though
 
  • #4
r0bHadz said:
The question would be: how can I find a way to solve for r without just plug and chugging? Is there some algebraic manipulation or something that I can do?

And I don't think I can answer that yet. The next chapter in my book is on expectation and variance. Without this knowledge, and just based on the day to day definition of "expectancy" I would say its well below .5 though

Ok then your underlined question exceeds your current grasp... this is a model problem for Poisson approximations (which you can easily get the answer for, working in logspace, and then plug in once to the actual formula to corroborate the exact solution). But you really should lock down the expected value first. (i.e. ##\lambda = \text{expected value}##)

The birthday problem is a classic, and is used in many places (e.g. modelling collisions in hashtables, if you know some CS). Suggestion: return to this problem after you've gone through the section on expected values.

note: even for plugging and chugging, you can be smart about it -- e.g. do binary search while plugging. So you had to explicitly solve 30 students and found that is too much to get the the 50% probability, so then try 15 (which is too little) so then try 22 or 23...

r0bHadz said:
solving the equation 1- P(365,r)/365^30 > .5 => (365-r)! > (364!)/( (.5)(365)^29 )

but there must be some way to solve this without plug and chugging but my IQ is limiting me from arriving to how. Of course I can just plug numbers in but that doesn't seem practical. And my school doesn't allow graphing calculators, but even if it did I doubt the calc wouldn't overflow with numbers so large

ok consider the complementary value: ## \frac{P(365,r)}{365^{30}}##. Why are you talking about overflow? You should know to work in logspace for this... you might have to compute a few things by hand for the numerator (save the cumulative results in a table!) and obviously ##\log\big(365^{30}\big) = 30 \cdot \log\big(365\big)##... so compute the answer in logspace and exponentiate your way out at the end.
 
Last edited:
  • #5
Ok, I have another idea of how to get to the Poisson approximation result without mentioning expected values. I'm a touch leery of this as a 'precalculus' forum idea, but

are you familiar with the result that ##e^x \approx 1+x## for 'small' ##x##?
 
Last edited:
  • #6
Maybe another rule of thumb or useful result is : ##0.71^2 =0.49 ## close to 0.5. So you got this close with just 30 birthdays. So you need a new product involving 335, 335, etc. , to get close to .71. Sure they will . But still, I checked online and it seems like 23 people are enough for .49.
 
  • Like
Likes r0bHadz
  • #7
Yes, it makes sense that e^x is approx = to 1+x for |x|< approx .5 and is especially close as x approaches 0

WWGD, that solution is brilliant.
 
  • Like
Likes WWGD
  • #8
r0bHadz said:
Yes, it makes sense that e^x is approx = to 1+x for |x|< approx .5 and is especially close as x approaches 0

ok so, write it out as a loose approximation for now

##\frac{P(365,r)}{365^r} = \big(\frac{365}{365}\big)\big(\frac{364}{365}\big)\big(\frac{363}{365}\big)...\big(\frac{365-(r-1)}{365}\big) = \big(1 - \frac{0}{365}\big)\big(1 - \frac{1}{365}\big)\big(1-\frac{2}{365}\big)...\big(1-\frac{(r-1)}{365}\big)##
##\approx e^{-\frac{0}{365}}\cdot e^{-\frac{1}{365}} \cdot e^{-\frac{2}{365}}\cdot ...\cdot e^{-\frac{(r-1)}{365}} = e^{\frac{-1}{365}(0 + 1 + 2 + ... + (r-1)} ##

do you know how to get ##\big(0 + 1 + 2 + ... + (r-1)\big)## in closed form? (Hint: triangular number)
 
Last edited:
  • Like
Likes SammyS and r0bHadz
  • #9
r0bHadz said:

Homework Statement


a)
Consider a class with 30 students. Compute the probability that at least two of them
have their birthdays on the same day. (For simplicity, ignore the leap year).

b)
How many students should be in class in order to have this probability above 0.5?

Homework Equations

The Attempt at a Solution


the answer to a is

P(at least two have birthday on same day) = 1 - P(none have bday on same day)

= 1 - ( (365)(364)...(336) / (365)^30 ) = about .71

Honestly, just computing the first answer was tedious enough. I had to multiply everything out and it took forever. So I decided to write a python script to make it faster.

Now for the second one it is

solving the equation 1- P(365,r)/365^30 > .5 => (365-r)! > (364!)/( (.5)(365)^29 )

but there must be some way to solve this without plug and chugging but my IQ is limiting me from arriving to how. Of course I can just plug numbers in but that doesn't seem practical. And my school doesn't allow graphing calculators, but even if it did I doubt the calc wouldn't overflow with numbers so large

I think you are expecting too much. Sometimes the most practical way of getting an "exact" solution to a problem is to just go ahead and compute/plot and see where your resulting curve crosses the point of interest. So, if P(n) is the probability of (at least one) match in a group of size n, just plot P(n) vs. n and see where P(n) goes from < 0.5 to >= 0.5. If (as you say) you can write a Python script, that should be a snap. If you have access to any reasonable spreadsheet (EXCEL or an open-source alternative) that should be easy as well.

If you want some type of "symbolic" solution, involving nice functions and seemingly-nice equations you are likely out of luck, but various types of approximations (as suggested by others) may be available.
 
  • Like
Likes r0bHadz and StoneTemplePython
  • #10
You can do step by step calculation for n people.

Make two functions
Yes(n)= probability that at least two of them have birthday on the same day
No(n)=probability that everyone have the different birth day
You ca see that Yes(n)+No(n) =1 for any n

Yes(1)=0 - it is only one person . Can't have the same birthday as somebody else because it is only one person
No(1)=1 - Similar funny thinking

Yes(n)= Yes(n-1)+No(n-1) x (n-1)/365
Some of n people have a birthday on the same day if
- if some of n-1 people have a birthday on the same day and n-th person doesn't matter
- and if none of n-1 people doesn't have the same birthday but n-th person have the birthday on the same day as someone of them (n-1)/365

Also none of n people have a birthday on the same day if none of n-1 people nave a birthday on the same day
and n-th person have a birthday on some of other 365-(n-1) days
No(n)=No(n-1) x (365-n+1)/365

Excel Sheet ...
n 365-(n-1) No(n)
1 365 1.000

2 364 0.997
3 363 0.992
4 362 0.984
5 361 0.973
6 360 0.960
7 359 0.944
8 358 0.926
9 357 0.905
10 356 0.883
11 355 0.859
12 354 0.833
13 353 0.806
14 352 0.777
15 351 0.747
16 350 0.716
17 349 0.685
18 348 0.653
19 347 0.621
20 346 0.589
21 345 0.556
22 344 0.524
23 343 0.493 => Yes (23) = 1- 0.493 > 0.5
24 342 0.462
25 341 0.431
26 340 0.402
27 339 0.373
28 338 0.346
29 337 0.319
30 336 0.294
 
  • Like
Likes SammyS

FAQ: Birthday problem (probability)

What is the Birthday problem?

The Birthday problem, also known as the Birthday paradox, is a probability problem that examines the likelihood of two or more people sharing the same birthday in a group of individuals.

What is the formula for calculating the probability in the Birthday problem?

The formula for calculating the probability in the Birthday problem is P(n) = 1 - (365!/[(365-n)! * 365^n]), where n is the number of people in the group. This formula takes into account the number of possible combinations of birthdays in a group of n people.

What is the significance of the number 23 in the Birthday problem?

The significance of the number 23 in the Birthday problem is that it is the minimum number of people needed in a group for there to be a greater than 50% chance of two people sharing the same birthday. This is due to the fact that there are 365 possible birthdays and as the number of people increases, the probability of a shared birthday also increases.

How accurate is the Birthday problem?

The Birthday problem is a mathematical concept and is not meant to accurately predict individual birthdays. It is a theoretical probability and does not take into account factors such as leap years, varying birth rates, and cultural differences in birthday distribution. Therefore, its accuracy may vary in real-life situations.

What other applications does the Birthday problem have?

The Birthday problem has applications in fields such as cryptography, where it is used to analyze the likelihood of two people sharing the same encryption key. It is also used in data analysis and can help determine the probability of duplicate records in a database. Additionally, it can be applied in scheduling and event planning to avoid conflicts in birthdays among a group of people.

Similar threads

Replies
3
Views
1K
Replies
12
Views
3K
Replies
2
Views
2K
Replies
12
Views
4K
Replies
3
Views
5K
Replies
7
Views
2K
Replies
2
Views
1K
Back
Top