dspampi
In this problem, assume researchers have constructed a risk score for HIV for the U.S. population, which is a function of risk factors such as frequency of unprotected sex, use of intravenous drugs, having another sexually transmitted infection, etc. Assume each risk factor measured is discrete valued. The risk score r for an individual is defined as the fraction who are HIV positive among those in the U.S. with exactly the same risk factors as this individual. (In practice the risk score will have to be estimated, but here we assume it is known.) Let R ⊆ [0, 1] denote the set of possible risk scores. (R may not be the entire interval [0, 1] if for some values of r, no individual in the U.S. has risk score r.)
We consider the ELISA HIV test. We assume this test has known sensitivity denoted by s1 and known specificity denoted by s2, neither of which depend on the risk score. That is, for any r, if we consider the population of those with risk score r, the test’s sensitivity and specificity among this population are s1 and s2, respectively, where s1,s2 do not depend on r. We assume the number of HIV positive individuals in the U.S. is 1.2 million, and the total U.S. population is 310 million.
Note: the sensitivity of a test is the probability the outcome of the test is positive given that the person tested has HIV; the specificity of a test is the probability the outcome of the test is negative given that the person tested does not have HIV.
(a) For each r ∈ R, let p(r) denote the prevalence of HIV infection among those in the U.S. population with risk score r. Write an expression for the function p(r) as a function of r. (Hint: it’s a very simple function of r.)
(b) Assume an individual in the U.S. is selected at random, and has risk score r. The individual is given the ELISA HIV test, and tests positive. Given just this information, what is the probability that the individual is HIV infected? (This is the positive predictive value of the test, within the population of individuals with risk score r.) Your answer should be an expression involving p(r),s1,s2 (or could just involve r,s1,s2 if you plug in the answer from (a) for p(r)).
There are other parts to the problem but I want to see what I've got so far:
For (a): since we want the prevalence p(r) as a function of r, I'm thinking it would be (number of HIV+ people / total number of people) * r.
(b) Since the person has risk score r and their test came back positive, we want the probability that they are actually HIV+. So I think it's:
P(HIV+ | T+) = s1*p(r) / [s1*p(r) + (1-s2)*(1-p(r))] ?
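To sanity-check the formula in (b) numerically, here is a minimal Python sketch of the Bayes' rule calculation. The function name and the values of s1 and s2 are made up for illustration (the problem gives no numbers for them); only the 1.2 million and 310 million figures come from the problem statement. The prevalence p is passed in directly, so the sketch does not depend on the answer to (a).

```python
def positive_predictive_value(p, s1, s2):
    """P(HIV+ | test+) for prevalence p, sensitivity s1, specificity s2."""
    true_pos = s1 * p                # P(test+ and HIV+)
    false_pos = (1 - s2) * (1 - p)   # P(test+ and HIV-)
    return true_pos / (true_pos + false_pos)


# Hypothetical test characteristics, chosen only for illustration.
s1, s2 = 0.99, 0.98

# Overall U.S. prevalence from the figures in the problem statement.
overall = 1.2e6 / 310e6

# The other two prevalence values are arbitrary examples.
for p in (overall, 0.05, 0.5):
    ppv = positive_predictive_value(p, s1, s2)
    print(f"p = {p:.4f} -> PPV = {ppv:.4f}")
```

With s1 and s2 held fixed, the printed PPV should rise as p rises, which is what the formula predicts: the false-positive term (1-s2)*(1-p) shrinks relative to the true-positive term s1*p.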