Observing interactions with plots using est. coeff.

FallenApple · Apr 24, 2017

This question has two parts: one for the linear case and one for the logistic case. Say X is a continuous variable and we want to see how X affects the response in two different groups, G1 = Group 1 and G2 = Group 2.

In linear regression, we can plot the regression lines for the two groups using the estimated coefficients to see whether there is an interaction between X and group. If the lines are parallel, that suggests no interaction; if they are not parallel, that suggests an interaction. Then I would check the p value of the coefficient to see if this is really the case.
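To make the link between the plotted lines and the interaction coefficient concrete, here is a minimal sketch. It assumes a model of the form y = b0 + b1*x + b2*g + b3*(x*g), with g = 0 for G1 and g = 1 for G2; the coefficient values and function names are made up for illustration and do not come from any fitted data.

```python
# Hypothetical estimated coefficients for
#   y = b0 + b1*x + b2*g + b3*(x*g),  g = 0 for G1, g = 1 for G2.

def group_lines(b0, b1, b2, b3):
    """Return (intercept, slope) of the fitted line for G1 and for G2."""
    line_g1 = (b0, b1)            # g = 0
    line_g2 = (b0 + b2, b1 + b3)  # g = 1
    return line_g1, line_g2

def gap(b2, b3, x):
    """Vertical distance between the G2 line and the G1 line at x."""
    return b2 + b3 * x

# b3 = 0: same slope in both groups, so the lines are parallel.
no_interaction = group_lines(1.0, 2.0, 0.5, 0.0)
# b3 != 0: the slopes differ by b3, so the lines are not parallel.
with_interaction = group_lines(1.0, 2.0, 0.5, 0.3)

for x in (0.0, 5.0, 10.0):
    # The gap is constant in x only when b3 = 0.
    print(x, gap(0.5, 0.0, x), gap(0.5, 0.3, x))
```

Under this parameterization the two slopes differ by exactly b3, so the plot and the interaction coefficient carry the same information about the point estimate. The plot by itself says nothing about the uncertainty in b3.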

Is that true? If it is, then why even plot using the estimated coefficients? The p-values should be enough. If the p-value of the Wald test for the interaction term is not small, then there is insufficient evidence for an interaction. Is it because when the p-value is small, we still want to see just how much interaction there is? But wouldn't the absolute value of the interaction coefficient be a good hint? Or do we still need visualization?

What about logistic regression? There I look at the probability curves P[Y=1|X,G1] and P[Y=1|X,G2]. If the difference delta = P[Y=1|X,G1] - P[Y=1|X,G2] is constant at each X, does that mean there is no interaction? Is this like the linear case?
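A small numerical sketch may help with the logistic part. It assumes the usual model logit(P[Y=1|x,g]) = b0 + b1*x + b2*g + b3*(x*g), with g = 0 for G1 and g = 1 for G2, and uses made-up coefficients with b3 = 0, that is, no interaction on the log-odds scale.

```python
import math

def prob(x, g, b0=-2.0, b1=0.5, b2=1.0, b3=0.0):
    """P[Y=1 | x, g] under logit(p) = b0 + b1*x + b2*g + b3*x*g (hypothetical values)."""
    eta = b0 + b1 * x + b2 * g + b3 * x * g
    return 1.0 / (1.0 + math.exp(-eta))

def logit(p):
    """Log-odds of a probability p."""
    return math.log(p / (1.0 - p))

xs = [0.0, 2.0, 4.0, 6.0, 8.0]

# Difference between the groups on the probability scale at each x.
prob_diffs = [prob(x, 1) - prob(x, 0) for x in xs]
# Difference between the groups on the log-odds scale at each x.
logit_diffs = [logit(prob(x, 1)) - logit(prob(x, 0)) for x in xs]

for x, dp, dl in zip(xs, prob_diffs, logit_diffs):
    print(x, round(dp, 4), round(dl, 4))
```

With b3 = 0 the log-odds difference equals b2 at every x, while the probability difference changes with x. So under this model, a non-constant delta on the probability scale does not by itself indicate a nonzero interaction coefficient; the analogue of parallel lines in the linear case is parallel curves on the log-odds scale.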

Stephen Tashi · Apr 28, 2017

FallenApple said:

Then I would check the p value of the coefficient to see if this is really the case.

Which coefficient are you talking about? - and how do you arrive at a p-value for it?

MarneMath · Apr 29, 2017

It sounds like what the individual is doing is running p models for each category of interaction that the model may have, then comparing the models by their coefficients, and then deriving a conclusion via the p-values. This is not the right approach. You need to instead find the p-value for the difference between the models. There's a standard error estimated for each model, and it's very possible to see differences at each observed point without those differences being statistically significant.

Wald's test would only be valid if the models are nested, that is, if one is a special case of the other. (Although don't quote me on that.)

Lastly, relying on just p-values is never a good idea. If you can visualize your data, then do it. When your sample is large, nearly everything rejects the null hypothesis.

FallenApple · May 1, 2017

MarneMath said:

It sounds like what the individual is doing is running p models for each category of interaction that the model may have, then comparing the models by their coefficients, and then deriving a conclusion via the p-values. This is not the right approach. You need to instead find the p-value for the difference between the models. There's a standard error estimated for each model, and it's very possible to see differences at each observed point without those differences being statistically significant.

Wald's test would only be valid if the models are nested, that is, if one is a special case of the other. (Although don't quote me on that.)

Lastly, relying on just p-values is never a good idea. If you can visualize your data, then do it. When your sample is large, nearly everything rejects the null hypothesis.

OK, so basically do a likelihood ratio test between the two models, right?
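For the linear case with two groups, such a likelihood ratio test can be sketched in plain Python as follows. This is an illustration under stated assumptions, not a reference implementation: it assumes Gaussian errors, compares a reduced model (separate intercepts, common slope) against a full model (separate intercepts and slopes), and uses simulated data. The function names are hypothetical. With two groups the models differ by one parameter, so the statistic is compared against a chi-square distribution with 1 degree of freedom, whose upper tail can be written with `math.erfc`.

```python
import math
import random

def centered_sums(xs, ys):
    """Within-group sums of squares and cross-products about the group means."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    syy = sum((y - my) ** 2 for y in ys)
    return sxx, sxy, syy

def lr_test_interaction(groups):
    """groups: two (xs, ys) pairs. Returns (LR statistic, p-value) for H0: common slope."""
    sums = [centered_sums(xs, ys) for xs, ys in groups]
    n = sum(len(xs) for xs, _ in groups)
    # Full model: each group has its own intercept and slope.
    rss_full = sum(syy - sxy ** 2 / sxx for sxx, sxy, syy in sums)
    # Reduced model: own intercepts, one pooled slope.
    sxx_tot = sum(s[0] for s in sums)
    sxy_tot = sum(s[1] for s in sums)
    syy_tot = sum(s[2] for s in sums)
    rss_reduced = syy_tot - sxy_tot ** 2 / sxx_tot
    # Gaussian likelihood ratio statistic; clamp tiny negative rounding error.
    lr = max(0.0, n * math.log(rss_reduced / rss_full))
    # Upper tail of chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(lr / 2.0))
    return lr, p_value

rng = random.Random(0)  # fixed seed so the simulated data are reproducible

def simulate(slope, n=50):
    """Simulated group: y = 1 + slope*x + standard normal noise (made-up values)."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [1.0 + slope * x + rng.gauss(0, 1) for x in xs]
    return xs, ys

lr, p_value = lr_test_interaction([simulate(2.0), simulate(3.0)])
print(lr, p_value)
```

The simulated slopes differ by a large amount relative to the noise, so the test should reject the common-slope model here. With real data one would normally use a statistics package rather than hand-rolled sums.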

Also, why is visualizing data better than just getting p-values from regression models? Is it because visualization looks at the data as it is? So if there were a way to perfectly visualize the data, then we would not need to do the regression at all?

MarneMath · May 1, 2017

As I stated, as your sample size increases, most statistical tests will reject the null hypothesis. Statistical tests were designed to be rather sensitive. They weren't meant for millions upon millions of data points. Thus, for example, the Shapiro test will often reject normality even though, if you look at the data, it's normal enough. You can even take smaller subsamples such that 99 times out of 100 the Shapiro test fails to reject, but if you take the entire sample, it rejects.
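The sample-size effect can be illustrated without any data. The sketch below swaps the Shapiro test for a simpler two-sided one-sample z-test, purely because its p-value has a closed form; it assumes a fixed, practically negligible true effect of 0.01 standard deviations and, as an idealization, that the sample mean equals the true mean exactly. The function name is hypothetical.

```python
import math

def two_sided_p(effect_in_sd, n):
    """Two-sided z-test p-value when the observed mean sits exactly
    effect_in_sd standard deviations away from the null value."""
    z = abs(effect_in_sd) * math.sqrt(n)
    # P(|Z| > z) for a standard normal Z.
    return math.erfc(z / math.sqrt(2.0))

# Same tiny effect, increasing sample size.
for n in (100, 10_000, 1_000_000):
    print(n, two_sided_p(0.01, n))
```

The effect never changes, only n does, yet the p-value goes from unremarkable to extremely small. That is the sense in which a rejection on a huge sample says little about whether the departure from the null matters in practice.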

Therefore, if possible, it's always good to look at your data and not rely on statistical tests alone.
