Multivariate hypothesis testing

In summary, hypothesis testing assesses whether observed data are consistent with a hypothesized population distribution; this thread asks how to do that when the observation is multivariate.
  • #1
bpet
How is hypothesis testing performed for multivariate data?

Say for simplicity we have two iid draws from a binomial distribution Bin(10,q) with X1=7, X2=8. Under the null hypothesis H0: q=1/2, the individual p-values (as upper one-tail probabilities) are approximately 0.172 and 0.055 respectively, so neither data point on its own is sufficient evidence to reject the null at the 5% significance level. What would be the p-value for the pair (7,8)?
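The two one-tail values quoted above can be checked directly from the binomial pmf. This is a minimal sketch; the helper name `upper_tail` is made up for illustration, and it assumes the alternative is q > 1/2, so the p-value is P(X >= k).

```python
from math import comb

def upper_tail(k, n=10, q=0.5):
    """P(X >= k) for X ~ Bin(n, q): upper one-tail p-value under H0."""
    return sum(comb(n, j) * q**j * (1 - q)**(n - j) for j in range(k, n + 1))

p7 = upper_tail(7)  # 176/1024
p8 = upper_tail(8)  # 56/1024
print(round(p7, 3), round(p8, 3))  # 0.172 0.055
```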
 
  • #2
One way to interpret your question is, "what is the sampling distribution generated by n=2, q=0.5?" as in http://faculty.vassar.edu/lowry/binomial.html

OTOH for a joint test of two variables you need to know their joint distribution. In the iid case that's F(x,y)=F(x)F(y).
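For the iid binomial example, one conventional way to turn the joint distribution into a single p-value is to test on the sum: under H0 the sum X1+X2 is Bin(20, 1/2). The sketch below assumes a one-sided alternative q > 1/2 and picks the sum as the test statistic; that choice is an assumption made here, not something the thread settles on, and `binom_upper_tail` is a hypothetical helper name.

```python
from math import comb

def binom_upper_tail(k, n, q):
    """P(S >= k) for S ~ Bin(n, q)."""
    return sum(comb(n, j) * q**j * (1 - q)**(n - j) for j in range(k, n + 1))

x = (7, 8)        # the observed pair from post #1
n_per_draw = 10   # each draw is Bin(10, q)

s = sum(x)                        # observed sum, 15
n_total = n_per_draw * len(x)     # the sum is Bin(20, 1/2) under H0
p_joint = binom_upper_tail(s, n_total, 0.5)
print(round(p_joint, 4))  # 0.0207
```

A different statistic (for example, one that orders outcomes by their joint probability) would generally give a different p-value for the same pair.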
 
  • #3
EnumaElish said:
One way to interpret your question is, "what is the sampling distribution generated by n=2, q=0.5?" as in http://faculty.vassar.edu/lowry/binomial.html

Thanks, though I don't quite understand how you mean to apply this to hypothesis testing.

EnumaElish said:
OTOH for a joint test of two variables you need to know their joint distribution. In the iid case that's F(x,y)=F(x)F(y).

The joint distribution on its own isn't really appropriate as a p-value, because F(x1,...,xn) would be O(1/2^n). For independent rv's I guess the Kolmogorov-Smirnov distance would be useful, since for a sample of size 1 it resembles a two-tail test. For non-independent samples I'm still not sure what is suitable.
 
  • #4
Do you care to explain your statement below?
bpet said:
The joint distribution on its own isn't really appropriate because F(x1,...,xn) would be O(1/2^n).
 
  • #5
EnumaElish said:
Do you care to explain your statement below?

Say the variables are independent. As a rough approximation, the values are clustered about the median, so F(x1,...,xn) ~ (1/2)^n. So the cdf on its own isn't really sufficient to use as a p-value, but I guess the multivariate generalization of the KS statistic could be used, though calculating the critical values would be quite difficult and would probably require Monte Carlo simulation.
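To illustrate the Monte Carlo idea on the pair (7,8) from post #1: the joint cdf value is not itself a p-value, but it can serve as a test statistic whose null distribution is simulated. The sketch below is a simplified stand-in, not the multivariate KS statistic. It assumes large values of F(x1,x2) count as extreme (alternative q > 1/2), and the function names are made up for illustration.

```python
import random
from math import comb

N_TRIALS = 10  # each draw is Bin(10, 1/2) under H0

# cdf table F(k) = P(X <= k) for X ~ Bin(10, 1/2)
pmf = [comb(N_TRIALS, k) / 2**N_TRIALS for k in range(N_TRIALS + 1)]
cdf = [sum(pmf[: k + 1]) for k in range(N_TRIALS + 1)]

def joint_cdf(xs):
    """F(x1,...,xn) = product of marginal cdfs, valid for independent draws."""
    out = 1.0
    for x in xs:
        out *= cdf[x]
    return out

def draw_binomial(rng):
    """One Bin(10, 1/2) draw: count the set bits among 10 fair random bits."""
    return bin(rng.getrandbits(N_TRIALS)).count("1")

def mc_p_value(observed, n_sim=100_000, seed=0):
    """Fraction of simulated null samples whose joint cdf is >= the observed one."""
    rng = random.Random(seed)
    t_obs = joint_cdf(observed)
    hits = 0
    for _ in range(n_sim):
        sample = [draw_binomial(rng) for _ in observed]
        if joint_cdf(sample) >= t_obs:
            hits += 1
    return hits / n_sim

print(joint_cdf((7, 8)))   # the raw joint cdf, not a p-value
p_hat = mc_p_value((7, 8))
print(p_hat)               # Monte Carlo p-value for this statistic
```

For a case this small the null distribution could also be enumerated exactly over the 11 x 11 outcomes; simulation only becomes necessary when enumeration is infeasible.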

As an example, since the multivariate normal cdf has no closed form, what would be a procedure to test a sample against, say, the distribution Xi ~ N(0,1) with E[XiXj]=r for i<>j, 1<=i,j<=N, when N is large?
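One possible approach that avoids the multivariate normal cdf entirely is the squared Mahalanobis distance x' Sigma^{-1} x, which is chi-square with N degrees of freedom under H0. For the equicorrelated matrix Sigma = (1-r) I + r 11' the inverse has a closed form, so the statistic costs O(N). The sketch below assumes r is known and satisfies -1/(N-1) < r < 1; this is just one choice of statistic, the function names are hypothetical, and the chi-square tail is estimated by simulation only to keep the example dependency-free.

```python
import random

def mahalanobis_sq(x, r):
    """x' Sigma^{-1} x for Sigma = (1 - r) I + r 11', via the closed-form inverse."""
    n = len(x)
    s1 = sum(x)
    s2 = sum(v * v for v in x)
    return (s2 - r * s1 * s1 / (1 + (n - 1) * r)) / (1 - r)

def chi2_tail_mc(t, dof, n_sim=100_000, seed=0):
    """Monte Carlo estimate of P(chi2_dof >= t); chi2_dof is Gamma(dof/2, scale=2)."""
    rng = random.Random(seed)
    hits = sum(rng.gammavariate(dof / 2, 2.0) >= t for _ in range(n_sim))
    return hits / n_sim

# Illustrative sample only: simulated under H0 with a one-factor construction
# (valid for 0 <= r < 1), X_i = sqrt(r) * Z0 + sqrt(1 - r) * Z_i.
rng = random.Random(1)
r, n = 0.3, 50
z0 = rng.gauss(0, 1)
x = [r**0.5 * z0 + (1 - r)**0.5 * rng.gauss(0, 1) for _ in range(n)]

t = mahalanobis_sq(x, r)
print(t, chi2_tail_mc(t, n))  # statistic and its estimated upper-tail p-value
```

This statistic reacts to departures in mean, scale and correlation together, so a small p-value does not say which of them is off.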
 

FAQ: Multivariate hypothesis testing

1. What is "multivariate hypothesis testing"?

Multivariate hypothesis testing covers statistical tests in which each observation is a vector of two or more variables. The hypothesis concerns their joint distribution, for example a mean vector or a dependence structure, so the variables are assessed together rather than one at a time.

2. How is multivariate hypothesis testing different from univariate hypothesis testing?

Univariate hypothesis testing concerns a single response variable, while multivariate hypothesis testing concerns several response variables at once and takes the dependence among them into account. This allows complex relationships between variables to be examined and can give a more comprehensive picture than a series of separate univariate tests.

3. What types of statistical tests are used in multivariate hypothesis testing?

Several statistical tests are used in multivariate hypothesis testing, including multivariate analysis of variance (MANOVA) and Hotelling's T-squared test. Multiple regression analysis and analysis of variance (ANOVA) are often listed alongside them, although each models a single response variable. These tests examine different types of relationships between variables.

4. What are the benefits of using multivariate hypothesis testing?

Multivariate hypothesis testing allows for the examination of relationships between multiple variables, which can provide a more comprehensive understanding of a phenomenon. It also allows for the control of confounding variables, which can improve the accuracy of the results.

5. What are some potential limitations of multivariate hypothesis testing?

One potential limitation of multivariate hypothesis testing is the need for a large sample size in order to accurately test multiple hypotheses. Additionally, the complexity of the statistical tests used may make it difficult to interpret the results for those without a strong background in statistics.
