Binomial Distribution: Likelihood Ratio Test for Equality of Several Proportions

AI Thread Summary
The discussion focuses on constructing a likelihood ratio test to compare voter proportions favoring candidate A across four political wards, using a significance level of α=0.05. The null hypothesis posits that the proportions are equal, while the alternative suggests at least one differs. The likelihood functions under both hypotheses are derived, but the resulting likelihood ratio, λ, appears to exceed 1, which contradicts expected properties. Participants discuss the need to take logarithms for simplification and express concerns about the sensitivity of the resulting test. The conversation emphasizes the importance of correctly deriving the likelihood ratio and its implications for statistical testing.
Ackbach
$\newcommand{\szdp}[1]{\!\left(#1\right)}
\newcommand{\szdb}[1]{\!\left[#1\right]}$
Problem Statement: A survey of voter sentiment was conducted in four midcity political wards to compare the fraction of voters favoring candidate $A.$ Random samples of $200$ voters were polled in each of the four wards. The numbers of voters favoring $A$ in the four samples can be regarded as four independent binomial random variables. Construct a likelihood ratio test of the hypothesis that the fractions of voters favoring candidate $A$ are the same in all four wards. Use $\alpha=0.05.$

Note 1: This is essentially Exercise 10.88 in Mathematical Statistics with Applications, 5th Ed., by Wackerly, Mendenhall, and Sheaffer.

Note 2: This is cross-posted here.

My Work So Far: Let $p_i$ be the proportion of voters favoring $A$ in Ward $i.$ So the null hypothesis is that $p_1=p_2=p_3=p_4,$ while the alternative hypothesis is that at least one proportion is different from the others. We have $f$ as the underlying distribution:
$$f(y_i)=\binom{n}{y_i}p_i^{y_i}(1-p_i)^{n-{y_i}}.$$
It follows that the likelihood function is
$$L(p_1,p_2,p_3,p_4)
=\prod_{i=1}^4\szdb{\binom{n}{y_i}p_i^{y_i}(1-p_i)^{n-y_i}}.$$
Then we construct $L\big(\hat\Omega_0\big)$ and $L\big(\hat\Omega\big).$ Note that under the null hypothesis, we will set $p_1=p_2=p_3=p_4=p.$ Hence,
$$L\big(\hat\Omega_0\big)
=\prod_{i=1}^4\szdb{\binom{n}{y_i}p^{y_i}(1-p)^{n-y_i}}.$$
The one remaining parameter $p$ we will replace with its MLE, which we can confidently say is $\big(\sum y_i\big)/(4n).$ Hence
\begin{align*}
L\big(\hat\Omega_0\big)
&=\prod_{i=1}^4\szdb{\binom{n}{y_i}\szdp{\frac{\sum
y_i}{4n}}^{\!\!y_i}\szdp{1-\frac{\sum y_i}{4n}}^{\!\!n-y_i}}\\
&=\frac{1}{(4n)^n}\prod_{i=1}^4\szdb{\binom{n}{y_i}\szdp{\sum
y_i}^{\!y_i}\szdp{4n-\sum y_i}^{\!n-y_i}}.
\end{align*}
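The pooled MLE used above can be checked with one line of calculus. This is only a sketch, writing $T=\sum_{j=1}^4 y_j$ (a shorthand introduced here, not in the original post) and dropping the binomial coefficients, which do not depend on $p$:
\begin{align*}
\ln L(p)&=\text{const}+T\ln p+(4n-T)\ln(1-p),\\
\frac{d}{dp}\ln L(p)&=\frac{T}{p}-\frac{4n-T}{1-p}=0
\quad\Longrightarrow\quad \hat p=\frac{T}{4n}.
\end{align*}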
Next, we turn our attention to $L\big(\hat\Omega\big):$
\begin{align*}
L\big(\hat\Omega\big)
&=\prod_{i=1}^4\szdb{\binom{n}{y_i}\szdp{\frac{y_i}{n}}^{\!y_i}
\szdp{1-\frac{y_i}{n}}^{\!n-y_i}}\\
&=\prod_{i=1}^4\szdb{\binom{n}{y_i}\szdp{\frac{y_i}{n}}^{\!y_i}
\szdp{\frac{n-y_i}{n}}^{\!n-y_i}}\\
&=\frac{1}{n^{4n}}\prod_{i=1}^4\szdb{\binom{n}{y_i}y_i^{y_i}
\,\szdp{n-y_i}^{n-y_i}}.
\end{align*}
Next we form the likelihood ratio:
\begin{align*}
\lambda
&=\frac{L\big(\hat\Omega_0\big)}{L\big(\hat\Omega\big)}\\
&=\frac{\displaystyle
\frac{1}{(4n)^n}\prod_{i=1}^4\szdb{\binom{n}{y_i}
\szdp{\sum y_i}^{\!y_i}\szdp{4n-\sum y_i}^{\!n-y_i}}}
{\displaystyle
\frac{1}{n^{4n}}\prod_{i=1}^4\szdb{\binom{n}{y_i}y_i^{y_i}
\,\szdp{n-y_i}^{n-y_i}}}\\
&=\frac{n^{4n}}{4^n\,n^n}\cdot
\prod_{i=1}^4\szdb{\szdp{\frac{\sum y_j}{y_i}}^{\!y_i}\,
\szdp{\frac{4n-\sum y_j}{n-y_i}}^{\!n-y_i}}\\
&=\szdp{\frac{n^3}{4}}^{\!\!n}\cdot
\prod_{i=1}^4\szdb{\szdp{\frac{\sum y_j}{y_i}}^{\!y_i}\,
\szdp{\frac{4n-\sum y_j}{n-y_i}}^{\!n-y_i}}.
\end{align*}
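A quick numerical sanity check can show whether the final expression agrees with the definition $\lambda=L\big(\hat\Omega_0\big)/L\big(\hat\Omega\big).$ The sketch below works in logs to avoid overflow. The counts in `ys` are illustrative values assumed for this example, not data given in the post, and the function names are hypothetical.

```python
import math

def log_kernel(y, n, p):
    """Log of p^y (1-p)^(n-y); the binomial coefficient is omitted
    because it cancels in the likelihood ratio."""
    return y * math.log(p) + (n - y) * math.log(1 - p)

def log_lambda_direct(ys, n):
    """ln(lambda) computed straight from the definition.
    Assumes 0 < y < n for every ward so all logs are finite."""
    p_pooled = sum(ys) / (len(ys) * n)                        # MLE under H0
    log_l0 = sum(log_kernel(y, n, p_pooled) for y in ys)      # restricted maximum
    log_l1 = sum(log_kernel(y, n, y / n) for y in ys)         # unrestricted maximum
    return log_l0 - log_l1

def log_lambda_posted(ys, n):
    """ln of the final expression exactly as posted (four wards):
    (n^3/4)^n * prod[(T/y_i)^y_i * ((4n-T)/(n-y_i))^(n-y_i)]."""
    total = sum(ys)
    out = n * math.log(n**3 / 4)
    for y in ys:
        out += y * math.log(total / y)
        out += (n - y) * math.log((4 * n - total) / (n - y))
    return out

ys = [76, 53, 59, 48]   # illustrative counts (assumed), n = 200 per ward
n = 200
print("direct ln(lambda):", log_lambda_direct(ys, n))
print("posted ln(lambda):", log_lambda_posted(ys, n))
```

With these inputs the direct value is negative while the posted expression gives a large positive log, and the two differ by exactly $3n\ln(4n),$ which points to a lost power of $4n$ rather than a deeper conceptual problem.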

My Questions:
1. This looks wrong to me, because I'm told (and it totally makes sense) that $0\le\lambda\le 1,$ whereas everything in sight is greater than $1.$
2. Supposing this expression can be salvaged, what are the next steps? Should I take logs and try to simplify somehow?
3. I'm expecting to be able to obtain a test something along the lines of
$$\frac{(1/(4n))\sum_{j=1}^4y_j-\sum_{j=1}^4(y_j/n)}{\displaystyle\sqrt{\sum_{j=1}^4\dfrac{(y_j/n)(1-y_j/n)}{n}}},$$
although this test doesn't strike me as sensitive enough. We could have $y_1/n$ much too low, and $y_4/n$ much too high, and this test could still mark them down as equal because they "average out" to the right thing. What's the right generalization to the standard difference of proportions test?
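On question 3, one standard route is the large-sample result that $-2\ln\lambda$ is approximately chi-square distributed under the null hypothesis, here with $4-1=3$ degrees of freedom (four free proportions versus one). The sketch below computes that statistic directly from the two maximized likelihoods, so it does not rely on any particular closed form for $\lambda.$ The counts are illustrative assumptions, the function names are hypothetical, and the tail-probability formula is the closed form that holds only for 3 degrees of freedom.

```python
import math

def lrt_equal_proportions(ys, n):
    """Return (-2 ln lambda, df) for H0: all binomial proportions are equal.
    Assumes every count satisfies 0 < y < n so all logs are finite."""
    k = len(ys)
    p_pooled = sum(ys) / (k * n)          # MLE of the common p under H0
    stat = 0.0
    for y in ys:
        p_i = y / n                        # unrestricted MLE for this ward
        stat += y * math.log(p_i / p_pooled)
        stat += (n - y) * math.log((1 - p_i) / (1 - p_pooled))
    return 2.0 * stat, k - 1

def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 df
    (closed form valid only for df = 3)."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

ys = [76, 53, 59, 48]   # illustrative counts (assumed), n = 200 per ward
n = 200
stat, df = lrt_equal_proportions(ys, n)
p_value = chi2_sf_df3(stat)
print("statistic:", stat, "df:", df, "p-value:", p_value)
print("reject H0 at alpha = 0.05:", p_value < 0.05)
```

Each ward's contribution to the statistic is nonnegative (it is $n$ times a Kullback-Leibler divergence between the ward's estimate and the pooled estimate), so a ward that is too low and a ward that is too high both push the statistic up instead of averaging out.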
 
COOLSerdash on CV.SE was able to simplify the likelihood ratio in such a way that I could tell mine is incorrect. Must have messed up in the algebra somewhere.
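For anyone retracing the algebra, here is one place the slip could be, offered as a sketch rather than as COOLSerdash's actual simplification (which is not reproduced in this thread). Under the null hypothesis each ward contributes a factor $(4n)^{-y_i}(4n)^{-(n-y_i)}=(4n)^{-n},$ so the four wards together give $(4n)^{-4n}$ rather than $(4n)^{-n}.$ Writing $T=\sum_{j=1}^4 y_j$ (shorthand introduced here), the ratio becomes
\begin{align*}
\lambda
&=\frac{n^{4n}}{(4n)^{4n}}\prod_{i=1}^4\left[\left(\frac{T}{y_i}\right)^{y_i}\left(\frac{4n-T}{n-y_i}\right)^{n-y_i}\right]\\
&=\prod_{i=1}^4\left[\left(\frac{T/(4n)}{y_i/n}\right)^{y_i}\left(\frac{1-T/(4n)}{1-y_i/n}\right)^{n-y_i}\right],
\end{align*}
which is the pooled likelihood divided by the per-ward maximum and therefore satisfies $0\le\lambda\le 1.$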
 