Regarding Simulations and Sample Size

  • #1
Agent Smith
Thread moved from the technical forums to the schoolwork forums
TL;DR Summary: Sims and sample size

A statistics question I have in my notes goes like this:

Our significance level ##\alpha = 0.01##
The percentage of left-handed people in the general population is ##10\%##. Liliana is curious whether this is true for her arts class, so she takes a random sample of ##8## [please note this number] students from her arts class and finds that ##1## is left-handed. That is, the proportion of lefties in her sample is ##0.125##.

The null hypothesis: ##H_0## is that the proportion of lefties in Liliana's class = ##10\%##
The alternative hypothesis: ##H_a## is that the proportion of lefties in Liliana's class ##> 10\%##

She then conducts 100 simulations, each time taking a sample of size ##8## [please note this number] from a virtual population in which ##10\%## are lefties. It turns out that in ##2## of her simulations the proportion of lefties is ##\geq 0.125##. This means, I'm told, that the probability of getting a proportion of lefties ##\geq 0.125## is ##\frac{2}{100} = 0.02##.

Then, the back-of-the-book answer says, since the ##\text{P-value} = 0.02## and ##\alpha = 0.01## and ##0.02 > 0.01##, we can't reject ##H_0##.

I hope all the above is correct.

My question concerns the sample size ##8## (the number I asked be noted). This sample size is too small for the number of successes and the number of failures to be ##\geq 10## i.e. one condition for inference from the sample is unmet and yet we have made an inference. Am I supposed to conclude that with simulations like the one described above we need not bother about sample size? So for this particular question, if my sample size is ##6##, I need only ensure that the simulation consists of samples of size ##6## and I'll still be able to make legitimate inferences from the sim???

N.B. Also if we reset ##\alpha = 0.05##, since ##0.02 < 0.05##, we can reject ##H_0## and conclude that Liliana's arts class has an "unusually high number" of lefties, right?
 
  • #2
Agent Smith said:
we have made an inference.
No. We have only made an assumption (the null hypothesis) and have not collected enough evidence to reject that assumption at that significance level.
Agent Smith said:
Am I supposed to conclude that with simulations like the one described above we need not bother about sample size? So for this particular question, if my sample size is ##6##, I need only ensure that the simulation consists of samples of size ##6## and I'll still be able to make legitimate inferences from the sim???
You can make assumptions (hypotheses) for the purpose of a statistical test with no data at all, but showing that an assumption should be rejected at some significance level requires enough sample data with contrary results.
Agent Smith said:
N.B. Also if we reset ##\alpha = 0.05##, since ##0.02 < 0.05##, we can reject ##H_0## and conclude that Liliana's arts class has an "unusually high number" of lefties, right?
With much less confidence. The chance of a Type I error is increased at the 0.05 level.
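
To make the trade-off concrete, here is a minimal sketch, assuming the number of lefties in a sample of 8 follows an exact Binomial(8, 0.10) distribution under ##H_0##. The function names `upper_tail` and `critical_count` are made up for this illustration. It finds, for each ##\alpha##, the smallest count that would lead to rejection and the actual Type I error rate at that cutoff:

```python
from math import comb

def upper_tail(n, p, k_min):
    """Exact P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min, n + 1))

def critical_count(n, p, alpha):
    """Smallest k with P(X >= k) <= alpha, or None if no such k exists."""
    for k in range(n + 1):
        if upper_tail(n, p, k) <= alpha:
            return k
    return None

for alpha in (0.01, 0.05):
    k = critical_count(8, 0.10, alpha)
    print(f"alpha = {alpha}: reject H0 when lefties >= {k}, "
          f"actual Type I error = {upper_tail(8, 0.10, k):.4f}")
```

Under these assumptions the cutoff drops from 4 lefties at ##\alpha = 0.01## to 3 lefties at ##\alpha = 0.05##, which is why the looser level rejects more easily and errs more often.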
 
  • #3
Agent Smith said:
It turns out that in 2 of her simulations the proportion of lefties is ≥0.125.
That is interesting. I just did this same thing, and I got 58 out of 100 with a proportion of at least 0.125. That seems to indicate a coding error in the simulation.
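
For anyone who wants to repeat the check, here is a minimal sketch in Python, assuming each simulated sample is 8 independent draws with a 10% chance of being left-handed. The function names and the seed are illustrative choices, not taken from the textbook. It computes the exact binomial tail and a Monte Carlo estimate of the same quantity:

```python
import random
from math import comb

def exact_tail(n, p, k_min):
    """Exact P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min, n + 1))

def simulated_tail(n, p, k_min, n_sims, seed=0):
    """Fraction of simulated samples of size n with at least k_min lefties."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for repeatability
    hits = 0
    for _ in range(n_sims):
        # Each student is left-handed with probability p, independently.
        lefties = sum(rng.random() < p for _ in range(n))
        if lefties >= k_min:
            hits += 1
    return hits / n_sims

# With n = 8, a sample proportion >= 0.125 means at least 1 lefty.
exact = exact_tail(8, 0.10, 1)               # equals 1 - 0.9**8, about 0.57
approx = simulated_tail(8, 0.10, 1, 10_000)
print(f"exact = {exact:.4f}, simulated = {approx:.4f}")
```

The exact value, ##1 - 0.9^8 \approx 0.57##, agrees with the 58 out of 100 reported here and is nowhere near ##0.02##.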

Agent Smith said:
This sample size is too small for the number of successes and the number of failures to be ≥10
This may be a good recommendation for simulations as well. If they had only two successes with 100 simulations they probably should have tried 1000 or 10000 instead.
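
As a rough illustration of why more simulations help, the sketch below uses the usual binomial standard error for an estimated proportion, ##\sqrt{\hat{p}(1-\hat{p})/N}##, where ##N## is the number of simulations. The helper name `mc_standard_error` is hypothetical, and the formula assumes the simulations are independent:

```python
from math import sqrt

def mc_standard_error(p_hat, n_sims):
    """Approximate standard error of a Monte Carlo proportion estimate."""
    return sqrt(p_hat * (1 - p_hat) / n_sims)

# Standard error of an estimated tail probability of 0.02
# for different numbers of simulations.
for n_sims in (100, 1_000, 10_000):
    se = mc_standard_error(0.02, n_sims)
    print(f"{n_sims:>6} simulations: standard error = {se:.4f}")
```

With only 100 simulations the standard error is about 0.014, which is comparable to the gap between the estimate 0.02 and ##\alpha = 0.01##, so the comparison is very noisy.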

Agent Smith said:
i.e. one condition for inference from the sample is unmet and yet we have made an inference
Yes, you have made an unreliable inference. That is exactly what those rules are intended to prevent.
 
  • #4
Why would they post a wrong question? 🤔

That's not the only question with a sample that fails the successes ##\geq 10## and failures ##\geq 10## test. Actually there's only ##1## other question, but that one tests a mean rather than a proportion. @Dale, do you know if there's an interactive website with a Monte Carlo simulation?

I believe the point to note is that this is a simulation of a sampling distribution of sample proportions. Could it be that the condition is instead that the distribution of the sample is uniform (no outliers, etc.)? It feels right to me to conduct a simulation in this way, with the simulation sample size = whatever the sample size was initially.
 
  • #5
[Attachment: Capture.PNG]

This is the sampling distribution of the sample proportions. Would you agree that the distribution is roughly normal, and so we don't have to concern ourselves with the sample size being inadequate to satisfy the successes ##\geq 10## and failures ##\geq 10## condition for inference?
 
  • #6
Agent Smith said:
[Attachment 351854]

This is the sampling distribution of the sample proportions. Would you agree that the distribution is roughly normal, and so we don't have to concern ourselves with the sample size being inadequate to satisfy the successes ##\geq 10## and failures ##\geq 10## condition for inference?
This is not the distribution of proportions for a sample size of 8. For a sample size of 8 the possible proportions are only integer multiples of 0.125, and between 0 and 1.
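
A short sketch of what the distribution for a sample size of 8 looks like, assuming the count of lefties is Binomial(8, 0.10). The variable names are illustrative:

```python
from math import comb

n, p = 8, 0.10

# Map each possible sample proportion k/n to its binomial probability.
pmf = {k / n: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

for proportion, probability in pmf.items():
    print(f"{proportion:.3f}  {probability:.5f}")
```

There are only nine possible values (0, 0.125, 0.25, ..., 1), and most of the probability sits on the first two, so the distribution is strongly right-skewed rather than bell-shaped.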

Agent Smith said:
It feels right to me to conduct a simulation in this way, with the simulation sample size = whatever the sample size was initially
I agree. Unless the simulation were intended to explore “what if” for different experimental approaches.
 
  • #7
Dale said:
This is not the distribution of proportions for a sample size of 8. For a sample size of 8 the possible proportions are only integer multiples of 0.125, and between 0 and 1.

I agree. Unless the simulation were intended to explore “what if” for different experimental approaches.
Apologies. Got lost in the woods. However, these are from similar questions, where the sample size is too small for the conditions for inference to be satisfied.

Would you agree that if the distribution is normal, we can ignore the small sample size?
 
  • #8
Agent Smith said:
Would you agree that if the distribution is normal, we can ignore the small sample size?
No. The inferences will be unreliable. Even if the distribution looks more or less normal, the estimated mean and standard deviation will not be reliable.

By the way, in my simulation I did not make the normal approximation. I used a binomial distribution. So if their simulation used the normal approximation that could be an issue.
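
To show how much the two approaches can differ at this sample size, here is a minimal sketch comparing the exact binomial tail with a normal approximation that has no continuity correction. Whether the textbook's simulation did anything like this is unknown, so this is only an illustration under those assumptions:

```python
from math import comb, erfc, sqrt

n, p0, p_hat = 8, 0.10, 0.125

# Exact binomial tail: P(X >= 1) for X ~ Binomial(8, 0.10).
k_obs = round(p_hat * n)
exact = sum(comb(n, k) * p0**k * (1 - p0)**(n - k) for k in range(k_obs, n + 1))

# Normal approximation to the sampling distribution of the sample proportion.
se = sqrt(p0 * (1 - p0) / n)        # standard error under H0
z = (p_hat - p0) / se               # z-score of the observed proportion
normal = 0.5 * erfc(z / sqrt(2))    # upper-tail probability P(Z >= z)

print(f"exact binomial tail  = {exact:.4f}")
print(f"normal approximation = {normal:.4f}")
```

The exact tail is about 0.57 and the normal approximation about 0.41, so the approximation is poor for a sample size of 8, and neither value is close to 0.02.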
 