Confidence interval for a cohort

In summary: Expressing the rows as ordered pairs [itex] (a_i , b_i) [/itex], I have [itex] \bar{x} = (\sum b_i \cdot a_i)/( \sum b_i) [/itex] and [itex] s^2 = \frac{1}{(\sum b_i) - 1} \cdot \sum b_i (a_i - \bar{x} )^2 [/itex].
  • #1
Mogarrr

Homework Statement


A cohort of hemophiliacs is followed to elicit information on the distribution of time to onset of
AIDS following seroconversion (referred to as latency time). All patients who seroconvert become
symptomatic within 10 years, according to the distribution in Table 6.11.

Table 6.11 Latency time to AIDS among hemophiliacs who become HIV positive
Latency time (years): Number of patients
0: 2
1: 6
2: 9
3: 33
4: 49
5: 66
6: 52
7: 37
8: 18
9: 11
10: 4

(I don't know how to make a proper table with LaTeX. I tried \being{tabular}{l r}, but this doesn't work.)

6.64 Assuming an underlying normal distribution, compute 95% CIs for the mean and variance of
the latency times.

Homework Equations



When the variance is unknown, the t-distribution with [itex] n-1 [/itex] degrees of freedom may be used:
[tex] \mu = \bar{x} \pm t_{n-1,1- \frac {\alpha}2} \cdot \frac {s}{\sqrt {n}} [/tex]

and estimating the variance, we have...

[itex] (n-1) \cdot \frac {s^2}{ \chi^2_{n-1,1- \frac {\alpha}2}} \leq \sigma^2 \leq (n-1) \cdot \frac {s^2}{ \chi^2_{n-1,\frac {\alpha}2}} [/itex]

Lastly, for the Poisson distribution with observed count [itex] x [/itex], the confidence interval is given by [itex] (\mu_1, \mu_2) [/itex], where [itex] \mu_1 [/itex] and [itex] \mu_2 [/itex] satisfy

[itex] \frac {\alpha}2 = P(X \geq x | \mu = \mu_1) = \sum_{k=x}^{\infty} \frac {e^{-\mu_1} \mu_1^{k}}{k!}[/itex]

[itex] \frac {\alpha}2 = P(X \leq x | \mu = \mu_2) = \sum_{k=0}^{x} \frac {e^{-\mu_2} \mu_2^{k}}{k!}[/itex]

The Attempt at a Solution



I'm not really sure how to handle this. I'm used to just one column, from which I can compute the mean and sample variance. Here I'm asked to compute the mean and variance of the latency time. Since this is a time interval, I think I should be using the Poisson distribution; however, it's given that the distribution is normal.

I don't know how to proceed. Any help would be appreciated.
 
  • #2
You can ignore the medical background - it is an interval that always starts at zero, so mean and variance of the interval are the mean and variance of your data. Sure, the values cannot get negative (so it cannot be a perfect gaussian distribution), but that's not important here.
 
  • #3
mfb said:
You can ignore the medical background - it is an interval that always starts at zero, so mean and variance of the interval are the mean and variance of your data. Sure, the values cannot get negative (so it cannot be a perfect gaussian distribution), but that's not important here.

Not sure what you mean by a perfect Gaussian distribution. Stripping away the medical terminology, I still don't know what to do.

Given that I have two columns of data, I picture (perhaps incorrectly) the first column as values for X and the other column as the associated probabilities. From there I don't know what to do.

Continuing the previous tangent: suppose I find an interval for [itex] \mu [/itex]. How could I relate this back to the first column? It's not like I know [itex]f^{-1}(x) [/itex].
 
  • #4
Wait a sec...

Am I just finding a confidence interval for the data in the first column?
 
  • #5
Mogarrr said:
Wait a sec...

Am I just finding a confidence interval for the data in the first column?

Yes, using probabilities estimated from the second column.
 
  • #7
Ray Vickson said:
Yes, using probabilities estimated from the second column.

Well, then I'd use Student's t-test, but...

How am I supposed to use the probabilities? I thought I could just find [itex] \bar{x}[/itex] and [itex] s^2[/itex] from the 1st column.
 
  • #8
Mogarrr said:
Well, then I'd use Student's t-test, but...

How am I supposed to use the probabilities? I thought I could just find [itex] \bar{x}[/itex] and [itex] s^2[/itex] from the 1st column.

You can, if you expand it all out to get 287 sample points, like this:
X = 0,0,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,3,3, ...,9,9,10,10,10,10. Of course there is an easier way, and that is what you need to figure out.
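
For anyone who wants to check their numbers, here is a minimal Python sketch of the expand-it-all-out approach. It uses only the standard library, and the names `counts` and `sample` are illustrative choices, not anything from the textbook.

```python
from statistics import mean, variance

# Frequency table from Table 6.11: latency time (years) -> number of patients.
counts = {0: 2, 1: 6, 2: 9, 3: 33, 4: 49, 5: 66,
          6: 52, 7: 37, 8: 18, 9: 11, 10: 4}

# Expand the table into one observation per patient: 0,0,1,1,1,1,1,1,2,...
sample = [years for years, n_patients in counts.items()
          for _ in range(n_patients)]

n = len(sample)        # total number of patients
x_bar = mean(sample)   # sample mean
s2 = variance(sample)  # sample variance (divides by n - 1)

print(n, round(x_bar, 3), round(s2, 3))  # prints: 287 5.237 3.601
```

This is the long way; it only confirms what the expanded sample gives, and the shorter route works on the table directly.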
 
  • #9
Ok, I think I got it now. This is like a table describing the frequency of a distribution.

Computing [itex] s^2 [/itex] might take a while.
 
  • #10
Mogarrr said:
Ok, I think I got it now. This is like a table describing the frequency of a distribution.

Computing [itex] s^2 [/itex] might take a while.

Not if you think first and calculate later.
 
  • #11
Thought about it, though I already did the calculation in Excel the long way.

Expressing the rows as ordered pairs [itex] (a_i , b_i) [/itex], I have...

[itex] \bar{x} = (\sum b_i \cdot a_i)/( \sum b_i) [/itex], and

[itex] s^2 = \frac{1}{(\sum b_i) - 1} \cdot \sum b_i (a_i - \bar{x} )^2 [/itex]...

That's what I think the easy computation is.
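
As a rough check, the two formulas in this post can be sketched in Python directly on the frequency table, without expanding it to 287 points. The variable names are illustrative. The shortcut at the end relies on the standard identity [itex] \sum b_i (a_i - \bar{x})^2 = \sum b_i a_i^2 - (\sum b_i a_i)^2 / \sum b_i [/itex], which is not stated in the thread but follows from expanding the square.

```python
# Ordered pairs (a_i, b_i): latency time in years, number of patients.
pairs = [(0, 2), (1, 6), (2, 9), (3, 33), (4, 49), (5, 66),
         (6, 52), (7, 37), (8, 18), (9, 11), (10, 4)]

n = sum(b for _, b in pairs)                       # sum of b_i
sum_ab = sum(a * b for a, b in pairs)              # sum of b_i * a_i
x_bar = sum_ab / n                                 # weighted mean

# Sample variance exactly as written in the post.
ss = sum(b * (a - x_bar) ** 2 for a, b in pairs)   # sum of b_i (a_i - x_bar)^2
s2 = ss / (n - 1)

# Equivalent shortcut that needs only two running sums.
sum_a2b = sum(a * a * b for a, b in pairs)         # sum of b_i * a_i^2
s2_shortcut = (sum_a2b - sum_ab ** 2 / n) / (n - 1)

print(n, sum_ab, sum_a2b, round(x_bar, 3), round(s2, 3))
```

Both routes should agree up to floating-point rounding; the shortcut is handy when doing the sums by hand or in a spreadsheet.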
 

FAQ: Confidence interval for a cohort

What is a confidence interval for a cohort?

A confidence interval for a cohort is a statistical tool used to estimate the range of values in which the true population mean of a specific cohort is likely to fall. It takes into account the sample size, standard deviation, and level of confidence to provide a range of values that is likely to contain the true population mean.

How is a confidence interval for a cohort calculated?

A confidence interval for a cohort is calculated using the sample mean, sample standard deviation, sample size, and level of confidence. The formula is: sample mean ± (critical value × standard deviation / √sample size). The critical value depends on the level of confidence (and, when the variance is estimated from the sample, on the t-distribution with n − 1 degrees of freedom) and is obtained from a statistical table or software.
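
A minimal sketch of that formula in Python, under one loudly stated assumption: it uses a normal (z) critical value from the standard library in place of a t critical value, since the standard library has no t quantile function. The helper name `mean_ci` is hypothetical, and the summary statistics are the ones computed from the latency table in the thread (n = 287, mean 1503/287, variance about 3.601).

```python
from math import sqrt
from statistics import NormalDist

def mean_ci(x_bar, s, n, confidence=0.95):
    """Return (lower, upper) for the mean using a normal critical value.

    Assumption: z is substituted for the t critical value, which is a
    reasonable approximation only for large n.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * s / sqrt(n)
    return x_bar - margin, x_bar + margin

# Summary statistics of the latency data from the thread.
n = 287
x_bar = 1503 / 287
s = sqrt(3.601)

low, high = mean_ci(x_bar, s, n)
print(round(low, 3), round(high, 3))
```

With a sample this large, the t critical value with 286 degrees of freedom is only slightly larger than z, so the t-based interval would be marginally wider.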

What is the purpose of a confidence interval for a cohort?

The purpose of a confidence interval for a cohort is to provide a range of values that is likely to contain the true population mean. It is used to estimate the precision and accuracy of the sample mean and to determine the level of uncertainty associated with the sample data. It also helps in making inferences about the population based on the sample data.

What is the difference between a confidence interval and a margin of error?

A confidence interval and a margin of error both describe the precision of a sample estimate. A confidence interval is a range of values within which the true population mean is likely to fall, while the margin of error is a single number: the half-width of that range (critical value × standard error). In other words, the confidence interval is the sample mean ± the margin of error.

How do I interpret a confidence interval for a cohort?

A confidence interval for a cohort should be interpreted as follows: "We are (level of confidence)% confident that the true population mean falls within this range of values." For example, if the confidence interval is 70-80 with a level of confidence of 95%, it means that we are 95% confident that the true population mean falls within the range of 70 to 80. The wider the range, the less precise the estimate, and the lower the level of confidence, the less certain we are about the estimate.
