Stats question- likelihood function for a ratio

In summary, genetic linkage theory suggests that the observed frequencies of four phenotypes resulting from crossing tomato plants are in the ratio 9/16 + a : 3/16 - a : 3/16 - a : 1/16 + a. The thread discusses how to write down the likelihood of a for the observed counts and how to find its maximum likelihood estimate.
  • #1
Zoe-b

Homework Statement


According to genetic linkage theory, observed frequencies of four phenotypes resulting from crossing tomato plants are in the ratio 9/16 + a : 3/16 - a : 3/16 - a : 1/16 + a.
In 1931, J.W. MacArthur reported the following frequencies:

Phenotype             Observed frequency
Tall, cut-leaf        926
Tall, potato-leaf     288
Dwarf, cut-leaf       293
Dwarf, potato-leaf    104
Total                 1611

Write down the likelihood of a given these observations. Find the maximum likelihood estimate of a, use it to calculate expected frequencies for the four phenotypes and compare them with the observed frequencies. Does genetic linkage theory look plausible?

Homework Equations


The likelihood of a data set (x1, ..., xn) is the product of the fX(xi), if the Xi are independent. Then I could find the maximum likelihood estimate etc. as usual.


The Attempt at a Solution


Basically I just cannot start this problem at all. I have only ever found likelihood before as a function producing one value, here I seem to want to find the probability that the ratio is b:c:d:e which I have no idea how to do. Also surely I need to know something about the variance of a? At least then I could write down a probability to do with each bit of the ratio?
Sorry if I'm not making much sense/if this is obvious but I have spent a long time attempting this with no luck. Google hasn't helped at all either..
 
  • #2
Maybe let's simplify: see how we would do it for a two-outcome case, and then see if we can build up from that.

So let's say you have a discrete probability distribution with one of 2 outcomes, B and C, with probabilities P(B) = b and P(C) = c = 1 - b.

Now let's say you run n independent trials. In this case the number of B events follows a binomial distribution, and the probability of getting k B events, given the probability b, is
[tex]
P(N_B=k|b) = \frac{n!}{k!(n-k)!}(1-b)^{n-k}b^k
[/tex]

But viewed as a function of b for fixed k, this is just the likelihood of the parameter b, so now
[tex]
L(b|N_B=k) = P(N_B=k|b) = \frac{n!}{k!(n-k)!}(1-b)^{n-k}b^k
[/tex]

Taking the logarithm
[tex]
\ln\{L(b|N_B=k)\} = \ln\left(\frac{n!}{k!(n-k)!}\right)+(n-k)\ln(1-b)+k\ln(b)
[/tex]

Differentiating with respect to b and setting the result to zero to find the MLE gives
[tex]
-\frac{n-k}{1-b} +\frac{k}{b}=0
[/tex]

Note that taking the logarithm and differentiating makes multiplicative constants (here the binomial coefficient) fall away: they only normalise the likelihood function and don't change its shape. Rearranging gives
[tex]
k(1-b)=(n-k)b
[/tex]

Giving
[tex]
b=\frac{k}{n}
[/tex]

which is what we would have guessed anyway, but hopefully you can build from there.
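
As a quick numerical sanity check on b = k/n, here is a minimal Python sketch. The counts k = 37 and n = 100, and the function names, are made up purely for illustration. It evaluates the log-likelihood on a grid (dropping the constant binomial coefficient) and picks the largest value.

```python
import math

def binomial_log_likelihood(b, k, n):
    """Log-likelihood of success probability b after k successes in n trials.

    The constant ln(n! / (k! (n-k)!)) is dropped: it does not depend on b.
    """
    return k * math.log(b) + (n - k) * math.log(1.0 - b)

def grid_argmax(k, n, steps=10_000):
    """Return the grid point in (0, 1) with the largest log-likelihood."""
    grid = [i / steps for i in range(1, steps)]
    return max(grid, key=lambda b: binomial_log_likelihood(b, k, n))

# Hypothetical counts, chosen only for illustration.
k, n = 37, 100
b_hat = grid_argmax(k, n)
print(b_hat, k / n)
```

A grid search is crude, but it makes the point that the maximum sits at k/n without relying on the calculus above.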
 
  • #3
The tricky part will be finding the probabilities, and this is only a start, but let's call the outcomes B, C, D, E, where e.g. B=b means that out of n trials you found b "Tall, cut-leaf" plants.

Let's deal with just the B information, the first observation. We can treat this as a binomial distribution, with n trials and b successes:
[tex]
P(B=b|a) = \frac{n!}{b!(n-b)!}\left(\frac{7}{16} - a\right)^{n-b}\left(\frac{9}{16} + a\right)^b
[/tex]

Now let's look at adding the C information. The outcome for C is not independent of the B outcome, so I'm thinking you can probably stack them up using conditional probability:
[tex]
P(B=b,C=c|a) = P(C=c|B=b,a)P(B=b|a)
[/tex]

Note that if B & C were independent, [itex] P(C=c|B=b,a)=P(C=c|a)[/itex] and this would reduce to a simple product of the two probabilities, but this is not the case: [itex] P(C=c|B=b,a)\neq P(C=c|a)[/itex].

We just found [itex] P(B=b|a)[/itex]. Now, if we take B=b as given, can you find [itex] P(C=c|B=b,a)[/itex]?

Then you can repeat and stack up the last piece of information as
[tex]
P(B=b,C=c,D=d|a) = P(D=d|C=c,B=b,a)P(C=c|B=b,a)P(B=b|a)
[/tex]
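
Here is one possible way to carry the stacking through. It assumes the four counts follow a multinomial model with b + c + d + e = n, which the posts above do not state explicitly. Given B=b, each of the remaining n-b plants is type C with probability (3/16 - a) divided by 1 - (9/16 + a) = 7/16 - a, so
[tex]
P(C=c|B=b,a) = \frac{(n-b)!}{c!(n-b-c)!}\left(\frac{3/16 - a}{7/16 - a}\right)^{c}\left(\frac{1/4}{7/16 - a}\right)^{n-b-c}
[/tex]

Given B=b and C=c, the remaining n-b-c plants are either D or E, whose probabilities sum to 1/4, so with e = n-b-c-d
[tex]
P(D=d|C=c,B=b,a) = \frac{(n-b-c)!}{d!\,e!}\left(\frac{3/16 - a}{1/4}\right)^{d}\left(\frac{1/16 + a}{1/4}\right)^{e}
[/tex]

Multiplying the three factors, the powers of 7/16 - a and of 1/4 cancel, which under this assumption leaves the multinomial form
[tex]
P(B=b,C=c,D=d|a) = \frac{n!}{b!\,c!\,d!\,e!}\left(\frac{9}{16}+a\right)^{b}\left(\frac{3}{16}-a\right)^{c+d}\left(\frac{1}{16}+a\right)^{e}
[/tex]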
 
  • #4
Zoe-b said:
Basically I just cannot start this problem at all. I have only ever found likelihood before as a function producing one value, here I seem to want to find the probability that the ratio is b:c:d:e which I have no idea how to do.

Hi Zoe-b!

Can you deduce from your ratio the probability of each of your phenotypes (as a function of a)?

Suppose you have a specific set of those plants with totals nB, nC, nD, and nE of each type.
What would the probability of that specific set be?

Note that this probability, viewed as a function of a, defines your likelihood.
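
To make this concrete, here is a minimal Python sketch rather than a definitive solution. It assumes the phenotype probabilities are exactly 9/16 + a, 3/16 - a, 3/16 - a and 1/16 + a, and that the plants are independent, so the log-likelihood is a sum of count times log-probability. The function names and the bisection helper are mine, not from the problem.

```python
import math

# Observed counts from the problem statement (MacArthur, 1931).
n_B, n_C, n_D, n_E = 926, 288, 293, 104

def log_likelihood(a):
    """Log-likelihood of a, dropping the multinomial coefficient (constant in a)."""
    return (n_B * math.log(9 / 16 + a)
            + (n_C + n_D) * math.log(3 / 16 - a)
            + n_E * math.log(1 / 16 + a))

def score(a):
    """Derivative of the log-likelihood with respect to a."""
    return (n_B / (9 / 16 + a)
            - (n_C + n_D) / (3 / 16 - a)
            + n_E / (1 / 16 + a))

def bisect_root(f, lo, hi, tol=1e-12):
    """Find the root of a decreasing function f on (lo, hi) by bisection."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# All four probabilities are positive only for -1/16 < a < 3/16.
eps = 1e-9
a_hat = bisect_root(score, -1 / 16 + eps, 3 / 16 - eps)
print(f"a_hat = {a_hat:.5f}")
```

Bisection works here because every term of the score is decreasing in a, so the derivative crosses zero exactly once on the allowed interval.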
 
  • #5
Zoe-b said:
Basically I just cannot start this problem at all. I have only ever found likelihood before as a function producing one value, here I seem to want to find the probability that the ratio is b:c:d:e which I have no idea how to do. Also surely I need to know something about the variance of a? At least then I could write down a probability to do with each bit of the ratio?
Sorry if I'm not making much sense/if this is obvious but I have spent a long time attempting this with no luck. Google hasn't helped at all either..

I'm not too sure the variance of a makes sense here, but at the end of the day you will have the likelihood function for a, which should help you decide whether the value of a needed to support the theory is reasonable given the data.
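
For the comparison step, here is a rough Python sketch under the same multinomial assumption; the helper names are hypothetical. With x = 16a, setting the derivative of the log-likelihood to zero and clearing denominators gives a quadratic in x, so the estimate can be written in closed form and the expected frequencies follow as n times each probability.

```python
import math

# Observed counts from the problem statement (MacArthur, 1931).
observed = {
    "Tall, cut-leaf": 926,
    "Tall, potato-leaf": 288,
    "Dwarf, cut-leaf": 293,
    "Dwarf, potato-leaf": 104,
}
n = sum(observed.values())  # 1611

def cell_probabilities(a):
    """Phenotype probabilities implied by the ratio in the problem statement."""
    return [9 / 16 + a, 3 / 16 - a, 3 / 16 - a, 1 / 16 + a]

def mle_of_a(counts):
    """Closed-form MLE of a, assuming a multinomial model.

    With x = 16a the score equation becomes A x^2 + B x + C = 0; the larger
    root is the one with -1 < x < 3, which keeps all probabilities positive.
    """
    n1, n2, n3, n4 = counts
    m = n2 + n3
    A = n1 + m + n4
    B = 10 * m + 6 * n4 - 2 * n1
    C = 9 * m - 27 * n4 - 3 * n1
    x = (-B + math.sqrt(B * B - 4 * A * C)) / (2 * A)
    return x / 16

a_hat = mle_of_a(list(observed.values()))
expected = [n * p for p in cell_probabilities(a_hat)]
for (name, obs), exp in zip(observed.items(), expected):
    print(f"{name:20s} observed {obs:4d}   expected {exp:7.1f}")
```

This only lines the two columns up side by side. Judging plausibility formally would need something like a goodness-of-fit test, which is not attempted here.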
 

Related to Stats question- likelihood function for a ratio

1. What is a likelihood function?

A likelihood function gives the probability (or probability density) of the observed data under a particular model, viewed as a function of the model's parameters. It measures how well each parameter value explains the data and is the basis of maximum likelihood estimation.

2. How is a likelihood function calculated for a ratio?

To write down a likelihood function for a ratio, you would first express the probability of each outcome in the ratio as a function of the unknown parameter. For independent observations, the likelihood is then the product of these probabilities over all observations, so each outcome's probability is raised to the power of its observed count.

3. What is the role of the likelihood function in statistics?

The likelihood function plays a central role in statistics because it is used to estimate the parameters of a statistical model. Maximum likelihood estimation picks the parameter values under which the observed data are most probable given the model.

4. Can the likelihood function be used to compare different models?

Yes, the likelihood function can be used to compare different models by calculating the maximised likelihood for each model and then comparing them. A higher likelihood indicates a closer fit to the data. However, a model with more free parameters can generally fit at least as well, so the likelihood alone is not enough to determine the best model and should be used in conjunction with other statistical methods.

5. Are there any limitations to using the likelihood function for a ratio?

One limitation of using the likelihood function for a ratio is that the simple product form assumes that the observations are independent and identically distributed, which may not always be the case. The likelihood is also only as good as the assumed model: if the assumed ratio does not describe the data, a maximum likelihood estimate can still be computed but may be misleading. It is important to consider these limitations when using the likelihood function for a ratio.
