NdotA
Hi,
It seems I cannot get a grip on the maths behind the statistics, and my head is so dizzy from all the terms and definitions that I cannot make head or tail of them any more. So I would like someone to check my approach and point me the right way if I am wrong. Thanks in advance for this.
Problem:
In clinical studies to determine the efficacy of a certain treatment I have two groups, one receiving a placebo, the other the remedy. The difference between the two groups is usually attributed to the remedy, but I have my doubts about this. Let us assume the population is 100,000 infected people, out of which I took two samples of 100 persons each for the placebo and verum groups. The criterion that I use to determine efficacy is the number of patients that recover within, say, two weeks. Let's assume that I find 30 % of the patients in the placebo group meet this criterion. What percentage in the verum group would indicate efficacy of the treatment? What if the verum group came out with, say, 45 %?
Approach 1: Pearson's Chi-squared test
My null hypothesis is that nothing changed, that is, the treatment did not have any influence on the result. As I do not know the proportion of instant healings in the population, my estimate is that this equals the proportion in my placebo group. So I do the calculation
χ² = (0.45 - 0.30)² / 0.30 + (0.55 - 0.70)² / 0.70 = 0.107
For a single degree of freedom this gives p = 0.90 (estimated from a graph, not exact). So I would assume that I do not have enough significance to reject my null hypothesis.
What astonishes me, if this is a proper approach, is that the size of the groups does not affect the significance this way. If I had done the study with only 10 people in each group, or as many as 1000, the resulting significance would be the same.
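As a cross-check on the group-size question, here is a minimal sketch that assumes Pearson's statistic is meant to be computed on counts (observed and expected numbers of patients) rather than on proportions, which is how the test is usually defined. It keeps the simplification from the post of treating the placebo proportion as the known expected proportion; a full two-group comparison would more commonly use a 2×2 table of counts. The function name `chi2_gof_two_cells` is made up for this illustration.

```python
import math


def chi2_gof_two_cells(p_observed, p_expected, n):
    """Pearson chi-squared on counts for a recovered / not-recovered split.

    Assumption: p_expected (here the placebo proportion) is treated as the
    known proportion under the null hypothesis, as in the post.
    """
    observed = [p_observed * n, (1 - p_observed) * n]
    expected = [p_expected * n, (1 - p_expected) * n]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For one degree of freedom, P(chi-squared > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


for n in (10, 100, 1000):
    chi2, p_value = chi2_gof_two_cells(0.45, 0.30, n)
    print(f"n = {n:4d}: chi2 = {chi2:8.3f}, p = {p_value:.4g}")
```

Computed on counts, the statistic is n times the value obtained from proportions (about 10.7 for n = 100 instead of 0.107), so under this assumption the group size does enter the significance.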
Approach 2: Consider size of samples
I found a formula on Wikipedia saying that a proportion can be estimated from the results of a sample to lie, with 95 % confidence, in the interval given by
p̂ ± √(0.25 / n) (btw: how do I properly write formulas here?)
So the placebo group indicates that, with 95 % confidence, the proportion of instant healings in the population would be between 0.25 and 0.35. So 0.45 is not within this interval and therefore most probably indicates some efficacy of the treatment. If I had obtained this result with groups of only 10, this interval would be 0.14 to 0.46, which would lead to a different result.
Of course, here the group size is too small, as seen from the wide interval.
But what astonishes me here is that the size of the population does not have any influence. I would guess that the size of my groups compared to my population should influence the result. If my group is 10 % of a population of 1000, it should yield more reliable results than if it were only 0.1 % of a population of 100,000.
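The interval arithmetic in Approach 2 can be reproduced with a short sketch. The helper `conservative_interval` and its `z` parameter are assumptions made for this illustration: with `z = 1.0` it gives exactly the numbers quoted in the post, while `z = 1.96` (the value the post later associates with p = 0.05) shows what the interval would be if that factor is meant to multiply the square root.

```python
import math


def conservative_interval(p_hat, n, z=1.0):
    """Return (low, high) for p_hat +/- z * sqrt(0.25 / n).

    sqrt(0.25 / n) is the largest possible standard error of a proportion
    (reached at p = 0.5). The z multiplier is an assumption of this sketch.
    """
    half_width = z * math.sqrt(0.25 / n)
    return p_hat - half_width, p_hat + half_width


for n in (10, 100):
    for z in (1.0, 1.96):
        low, high = conservative_interval(0.30, n, z)
        print(f"n = {n:3d}, z = {z:4.2f}: {low:.3f} to {high:.3f}")
```

Note that the population size does not appear anywhere in this formula, which is the behaviour questioned above.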
Approach 3: estimate of sample size
Edit, after I found out how to write formulas with LaTeX:
On the Internet I found this formula for the sample size (without any indication of its source, however)
[itex]n \ge \frac{z^2 \, \theta \, (1 - \theta) \, N}{\Delta\theta^2 \, (N - 1) + z^2 \, \theta \, (1 - \theta)}[/itex]
with
n - size of sample
N - size of population
z - Normal distribution value for level of significance (p = 0.05 -> z = 1.96)
θ - fraction of population
Δθ - width of confidence interval
I rearranged this formula to determine Δθ for my sample size and obtained a Δθ of 9 % for the placebo group and roughly 10 % for the verum group.
So with a probability of 95 % the confidence intervals are 25.5 % to 34.5 % for placebo and 40 % to 50 % for verum. They do not overlap, so I assume there has been some effect of the treatment.
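The rearrangement of the sample-size formula for Δθ can be sketched as follows. Solving the formula at equality gives Δθ = z · sqrt(θ (1 - θ) / n · (N - n) / (N - 1)). The function names are hypothetical. The sketch only computes Δθ; in the common margin-of-error form of this formula Δθ would be the distance on either side of θ rather than the full width used in the post, but since the source of the formula is not given, that reading is an assumption.

```python
import math


def delta_theta(theta, n, N, z=1.96):
    """Solve the sample-size formula (taken at equality) for delta-theta."""
    return z * math.sqrt(theta * (1 - theta) / n * (N - n) / (N - 1))


def sample_size(theta, delta, N, z=1.96):
    """The formula from the post, used here only as a round-trip check."""
    a = z ** 2 * theta * (1 - theta)
    return a * N / (delta ** 2 * (N - 1) + a)


N = 100_000
for label, theta in (("placebo", 0.30), ("verum", 0.45)):
    d = delta_theta(theta, n=100, N=N)
    print(f"{label}: delta-theta = {d:.4f}, "
          f"round-trip n = {sample_size(theta, d, N):.1f}")

# Effect of the population size for the same sample of 100:
for population in (1_000, 100_000):
    d = delta_theta(0.30, n=100, N=population)
    print(f"N = {population}: delta-theta = {d:.4f}")
```

In this formula the population size enters only through the factor (N - n) / (N - 1), which is close to 1 whenever the sample is a small fraction of the population.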
So, is there anybody who can point me a way out of this jungle?
N.A