A Proof on hypergeometric distribution

In summary: Expanding both sides of (1 + 'mu')^N = (1 + 'mu')^r (1 + 'mu')^(N-r) and equating the coefficients of 'mu'^n gives (N C n) = (r C 0)((N-r) C n) + (r C 1)((N-r) C (n-1)) + ... + (r C n)((N-r) C 0). Dividing both sides by (N C n) turns each term on the right into a hypergeometric probability P(X = k) = (r C k)((N-r) C (n-k)) / (N C n), so the probabilities sum to one. Here N is the population size, r is the number of successes in the population, and n is the sample size.
  • #1
irony of truth
Show directly that the set of probabilities associated with the hypergeometric distribution sum to one.

=> I am thinking that this tells me to prove that, since this is a probability distribution function, it really should sum to 1. Is that what the problem is asking me to do? =)

The book gives a hint that I should expand the identity:
(1 + 'mu')^N = (1 + 'mu')^r (1 + 'mu')^(N-r) and equate the coefficients.

Also, how should I equate the coefficients? Should I set 'mu' equal to -1? What I did was expand the left side of the given equation as 1 + (N C 1)'mu' + (N C 2)'mu'^2 + ... + (N C N)'mu'^N. That's where I got stuck... what about the coefficients?
 
  • #2
Yes, the problem is asking you to show that the probabilities associated with the hypergeometric distribution sum to one. Write N for the population size, r for the number of successes in the population, and n for the sample size, so that P(X = k) = (r C k)((N-r) C (n-k)) / (N C n).

To do this, expand both sides of the identity given in the hint: (1 + 'mu')^N = (1 + 'mu')^r (1 + 'mu')^(N-r). On the left side, we have: 1 + (N C 1)'mu' + (N C 2)'mu'^2 + ... + (N C N)'mu'^N. On the right side, we have: [1 + (r C 1)'mu' + (r C 2)'mu'^2 + ... + (r C r)'mu'^r] * [1 + ((N-r) C 1)'mu' + ((N-r) C 2)'mu'^2 + ... + ((N-r) C (N-r))'mu'^(N-r)].

Now equate the coefficients of each power of 'mu'. Do not substitute a particular value such as 'mu' = -1: the identity holds for every 'mu', so the two polynomials must agree coefficient by coefficient. The coefficient of 'mu'^0 is 1 on both sides. The coefficient of 'mu'^1 is (N C 1) on the left side and (r C 1) + ((N-r) C 1) on the right side. Thus, (N C 1) = (r C 1) + ((N-r) C 1). For 'mu'^2 the right side also picks up a cross term, so (N C 2) = (r C 2) + (r C 1)((N-r) C 1) + ((N-r) C 2).

In general, a 'mu'^n term on the right arises from multiplying 'mu'^k in the first bracket by 'mu'^(n-k) in the second, for k = 0, 1, ..., n. Thus, (N C n) = (r C 0)((N-r) C n) + (r C 1)((N-r) C (n-1)) + ... + (r C n)((N-r) C 0), where (m C j) is taken to be 0 when j > m.

Dividing both sides by (N C n) gives 1 = P(X = 0) + P(X = 1) + ... + P(X = n), which is the desired result.
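To see what "equating the coefficients" means concretely, here is a minimal Python sketch that multiplies out the two factors on the right side and compares the result with the left side. The values N = 7 and r = 3 and the helper names `binomial_coeffs` and `poly_mul` are illustrative assumptions, not something from the book or the thread.

```python
from math import comb

def binomial_coeffs(m):
    """Coefficients of (1 + mu)^m, lowest power first."""
    return [comb(m, j) for j in range(m + 1)]

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            # mu^i times mu^j contributes to the mu^(i+j) coefficient
            out[i + j] += a * b
    return out

N, r = 7, 3  # illustrative values only
left = binomial_coeffs(N)
right = poly_mul(binomial_coeffs(r), binomial_coeffs(N - r))

print(left)   # coefficients of (1 + mu)^N
print(right)  # coefficients of (1 + mu)^r * (1 + mu)^(N-r)
```

Both lists agree entry by entry, and the entry at position n is exactly the sum of the products (r C k)((N-r) C (n-k)) over k.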
 
  • #3


Yes, you are correct. The problem is asking you to prove that the sum of probabilities associated with the hypergeometric distribution is equal to 1. This is because, as you mentioned, it is a probability distribution function and the sum of all probabilities in a probability distribution should always be equal to 1.

To prove this, use the given hint and expand both sides of the identity:

(1 + 'mu')^N = (1 + 'mu')^r * (1 + 'mu')^(N-r)

Now, let's equate the coefficients on both sides. Let n be the sample size. The coefficient of 'mu'^n on the left side is (N C n), which represents the number of ways to choose n objects from a total of N objects. On the right side, a 'mu'^n term comes from pairing 'mu'^k in the first factor with 'mu'^(n-k) in the second factor, for each k = 0, 1, ..., n. Each pairing contributes (r C k)*(N-r C n-k), which represents the number of ways to choose k objects from a total of r objects and n-k objects from the remaining (N-r) objects.

Therefore, we can equate the coefficients as follows:

(N C n) = (r C 0)*(N-r C n) + (r C 1)*(N-r C n-1) + (r C 2)*(N-r C n-2) + ... + (r C n)*(N-r C 0)

Any term with k > r or n-k > N-r is zero, so only the attainable values of k contribute. Now divide both sides by (N C n):

1 = [(r C 0)*(N-r C n) + (r C 1)*(N-r C n-1) + ... + (r C n)*(N-r C 0)] / (N C n)

Each term on the right, (r C k)*(N-r C n-k) / (N C n), is the hypergeometric probability P(X = k), so the probabilities sum to 1.
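As a numeric sanity check (not a replacement for the proof), the sketch below sums the hypergeometric probabilities exactly with rational arithmetic for every small parameter choice. It assumes n denotes the sample size, and the function name `hypergeom_total` and the bound of 12 on N are arbitrary illustrative choices.

```python
from fractions import Fraction
from math import comb

def hypergeom_total(N, r, n):
    """Exact sum over k of (r C k) * ((N-r) C (n-k)) / (N C n)."""
    total = Fraction(0)
    for k in range(n + 1):
        # math.comb returns 0 when the lower index exceeds the upper one,
        # so unattainable values of k contribute nothing.
        total += Fraction(comb(r, k) * comb(N - r, n - k), comb(N, n))
    return total

# Check every population size up to 12, every r and every n.
all_one = all(
    hypergeom_total(N, r, n) == 1
    for N in range(1, 13)
    for r in range(N + 1)
    for n in range(N + 1)
)
print(all_one)
```

Using `Fraction` avoids floating-point rounding, so the comparison with 1 is exact.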
 

FAQ: A Proof on hypergeometric distribution

What is a hypergeometric distribution?

A hypergeometric distribution is a probability distribution that describes the number of successes in a fixed number of draws from a finite population without replacement. Because drawn items are not replaced, the draws are not independent. The distinction matters most when the sample is a non-negligible fraction of the population.

How is a hypergeometric distribution different from a binomial distribution?

A binomial distribution describes the number of successes in a fixed number of independent trials with a constant probability of success, as when sampling with replacement. In a hypergeometric distribution, the probability of success changes from draw to draw because the sample is taken without replacement.

What is the formula for calculating the hypergeometric distribution?

The formula for calculating the hypergeometric distribution is: P(X = k) = (C(a, k) * C(N-a, n-k)) / C(N, n), where P(X = k) is the probability of getting exactly k successes, N is the population size, n is the sample size, and a is the number of successes in the population.
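The formula can be evaluated directly with Python's standard library. The following is a minimal sketch, with the function name `hypergeom_pmf` and the example values N = 10, a = 4, n = 3 chosen purely for illustration.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(k, N, a, n):
    """P(X = k) = C(a, k) * C(N - a, n - k) / C(N, n), as an exact fraction."""
    # k must satisfy max(0, n - (N - a)) <= k <= min(n, a); otherwise P = 0.
    if k < max(0, n - (N - a)) or k > min(n, a):
        return Fraction(0)
    return Fraction(comb(a, k) * comb(N - a, n - k), comb(N, n))

N, a, n = 10, 4, 3  # illustrative: 10 items, 4 successes, draw 3
pmf = [hypergeom_pmf(k, N, a, n) for k in range(n + 1)]
for k, p in enumerate(pmf):
    print(k, p)
print("total:", sum(pmf))
```

For these values the probabilities are 1/6, 1/2, 3/10 and 1/30 for k = 0, 1, 2, 3, which add up to 1.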

How is the hypergeometric distribution used in real life?

The hypergeometric distribution is commonly used in quality control and market research to analyze data from small samples without replacement. It is also used in genetics and ecology to study the distribution of traits or species within a population.

Can the hypergeometric distribution be approximated by other probability distributions?

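Yes. When the population is much larger than the sample (N much greater than n), drawing without replacement barely changes the success probability, so the hypergeometric distribution is well approximated by a binomial distribution with n trials and success probability p = a/N. If, in addition, n is large and p is small, the binomial, and hence the hypergeometric, can in turn be approximated by a Poisson distribution with mean n*a/N.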
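A rough way to see the binomial approximation at work is to compare the two probability mass functions for a small and a large population with the same success fraction. The sketch below does this; the parameter values and the helper names (`hypergeom_pmf_float`, `binom_pmf`, `max_abs_gap`) are assumptions made for illustration.

```python
from math import comb

def hypergeom_pmf_float(k, N, a, n):
    """Hypergeometric P(X = k) as a float."""
    return comb(a, k) * comb(N - a, n - k) / comb(N, n)

def binom_pmf(k, n, p):
    """Binomial P(X = k) for n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def max_abs_gap(N, a, n):
    """Largest pointwise difference between the two pmfs over k = 0..n."""
    p = a / N
    return max(
        abs(hypergeom_pmf_float(k, N, a, n) - binom_pmf(k, n, p))
        for k in range(n + 1)
    )

# Same success fraction (40%) and sample size, different population sizes.
small_pop = max_abs_gap(N=20, a=8, n=5)
large_pop = max_abs_gap(N=20000, a=8000, n=5)
print(small_pop, large_pop)
```

The gap for the large population should come out far smaller than for the small one, which is the sense in which the binomial distribution approximates the hypergeometric when N is much greater than n.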
