Bayes' theorem and disease prevalence

BRN · Mar 10, 2021

Hello all!

I have to solve this exercise:
A tampon diagnostic test provides 1% positive results. The positive predictive values (probabilities of positive test disease) and negative (absence disease given negative test) are respectively 0.95 and 0.98.

What is the prevalence of the disease?
What are the sensitivity (probabilities of positive disease with disease) and specificity (negative test probability with disease absence) of the test?

With positive test (T+), negative test (T-), disease (D) and healthy (D-), I have this table:

	D	D-
T+	P(T+|D)	P(T+|D-)
T-	P(T-|D)	P(T-|D-)

The given data are ## P(D|T+) = 0.95 ##, ## P(D-|T-) = 0.98 ## and ## P(T+) = 0.01 ##.

From Bayes' theorem I can calculate the sensitivity, starting from:
$$ P(D|T+) = \frac{P(T+|D)P(D)}{P(T+)} $$

But how can I calculate the prevalence ## P(D) ##?

PeroK · Mar 11, 2021

BRN said:

The positive predictive values (probabilities of positive test disease) and negative (absence disease given negative test) are respectively 0.95 and 0.98.

But how can I calculate the prevalence ## P(D) ##?

I hope I understand the terminology here. I assume this means that if someone has the disease then there is a 0.95 probability of a positive result (and hence a 0.05 probability of a false negative). And, if someone does not have the disease, then there is a 0.98 probability of a negative result (and a 0.02 probability of a false positive). And, I guess the prevalence means the proportion of people who have the disease.

A tampon diagnostic test provides 1% positive results.

What you could do is assume that the prevalence is ##P(D) = p##, calculate the positive results (as a function of ##p##) and equate this to 1%.

PS Although that makes no sense: with a false positive rate of 0.02, you must get at least 2% positive tests, even if no one has the disease. Perhaps I've misunderstood the terminology?

PPS I did misunderstand!

haruspex · Mar 11, 2021

Your post seems a bit garbled. Do you mean:

BRN said:

A tampon diagnostic test provides 1% positive results. The positive predictive values (probabilities of disease given positive test ) and negative (absence disease given negative test) are respectively 0.95 and 0.98.

What is the prevalence of the disease?

What are the sensitivity (probabilities of positive test given disease) and specificity (negative test probability with disease absence) of the test?

BRN said:

From Bayes theorem I can calculate sensitivity starting from:
$$ P(D|T+) = \frac{P(T+|D)P(D)}{P(T+)} $$

But how can I calculate the prevalence ## P(D) ##?

So it's not Bayes' Theorem you need.
What is the relationship between P(A), P(A|B) and P(A|~B)?

PeroK · Mar 11, 2021

haruspex said:

Your post seems a bit garbled. Do you mean
So it's not Bayes' Theorem you need.
What is the relationship between P(A), P(A|B) and P(A|~B)?

How could you get 1% positive tests with a 2% false positive rate? That's how I read the question.

haruspex · Mar 11, 2021

PeroK said:

How could you get 1% positive tests with a 2% false positive rate? That's how I read the question.

No, the 2% is not a false positive rate. P(absence of disease given negative test)=0.98, so 2% of the negative tests are wrong.

PeroK · Mar 11, 2021

This is interesting. The sensitivity and specificity you are asked to find are related to False Positives and False Negatives:

https://en.wikipedia.org/wiki/Sensitivity_and_specificity

To be precise:

True Positive rate (Sensitivity) = probability/proportion of positive test results for those who are positive (have the disease)
False Negative rate = probability of a negative test for those who are positive

True Negative rate (Specificity) = probability/proportion of negative test results for those who are negative (do not have the disease)
False Positive rate = probability of a positive test for those who are negative

The values we are given are:

Positive Predictive Value: probability that a person is positive given a positive result
Negative Predictive Value: probability that a person is negative given a negative result

https://en.wikipedia.org/wiki/Positive_and_negative_predictive_values

That seems to be the standard terminology.
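
As a rough illustration of these definitions (not part of the exercise), all four quantities can be computed from the counts in a 2x2 table. The function name `diagnostic_metrics` and the example counts are made up for this sketch.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV from 2x2 counts.

    tp: diseased, test positive    fp: healthy, test positive
    fn: diseased, test negative    tn: healthy, test negative
    """
    return {
        "sensitivity": tp / (tp + fn),  # P(T+ | D)
        "specificity": tn / (tn + fp),  # P(T- | D-)
        "ppv": tp / (tp + fp),          # P(D | T+)
        "npv": tn / (tn + fn),          # P(D- | T-)
    }


# Hypothetical counts, chosen only to show the call.
metrics = diagnostic_metrics(tp=90, fp=30, fn=10, tn=870)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity condition on the true disease status, while the two predictive values condition on the test result, which is why the two pairs generally differ.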

BRN · Mar 11, 2021

Thanks for your help.

PeroK said:

What you could do is assume that the prevalence is ##P(D) = p##, calculate the positive results (as a function of ##p##) and equate this to 1%.

I tried this.
From Bayes' theorem I have

$$ P(D|T+) = \frac{P(T+|D)P(D)}{P(T+)} \Rightarrow P(T+|D)P(D)=P(D|T+)P(T+) $$

Now

$$ P(T+) = P(T+|D)P(D) + P(T+|D-)(1-P(D)) $$
$$ \Rightarrow P(T+) = P(D|T+)P(T+) + P(T+|D-)(1-P(D)) $$
$$ \Rightarrow P(T+) -P(D|T+)P(T+) = P(T+|D-)(1-P(D)) $$
$$ \Rightarrow P(T+) (1-P(D|T+)) = P(T+|D-)(1-P(D)) = FP$$
FP = False Positive.

Now I don't know how to continue...

haruspex · Mar 11, 2021

BRN said:

Thanks for your help.
OK, the terminology seems right.

So I have:
P(D|T+) = 0.95 Positive Predictive Value
P(D|T-) = 0.05
P(D-|T-) = 0.98 Negative Predictive Value
P(D-|T+) = 0.02
P(T+) = 0.01 positive results
P(T-) = 0.99

and I want to calculate:
P(D) = prevalence
P(T+|D) = Sensitivity
P(T-|D-) = Specificity

Please try to answer my question in post #3.

PeroK · Mar 12, 2021

BRN said:

So I have:
P(D|T+) = 0.95 Positive Predictive Value
P(D|T-) = 0.05
P(D-|T-) = 0.98 Negative Predictive Value
P(D-|T+) = 0.02
P(T+) = 0.01 positive results
P(T-) = 0.99

That doesn't look right. You must have ##P(D|T+) + P(D-|T+) = 1## etc.

BRN said:

and I want to calculate:
P(D) = prevalence
P(T+|D) = Sensitivity
P(T-|D-) = Specificity

$$ P(D|T+) = \frac{P(T+|D)P(D)}{P(T+)} \Rightarrow P(T+|D)P(D)=P(D|T+)P(T+) $$

Now

$$ P(T+) = P(T+|D)P(D) + P(T+|D-)(1-P(D)) $$
$$ \Rightarrow P(T+) = P(D|T+)P(T+) + P(T+|D-)(1-P(D)) $$
$$ \Rightarrow P(T+) -P(D|T+)P(T+) = P(T+|D-)(1-P(D)) $$
$$ \Rightarrow P(T+) (1-P(D|T+)) = P(T+|D-)(1-P(D)) = FP$$
FP = False Positive.

Now I don't know how to continue...

Okay, so you're going round in circles perhaps.

I find it's always easier to work from a probability tree, which gives a better insight than Bayes' Theorem (although it's the same information).

Using the tree method, the prevalence (##D_+##) falls out without any effort. The other quantities, sensitivity and specificity, you can practically just read off as well.
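
A minimal numerical sketch of such a tree, assuming the first split is on the test result (since ##P(T+)## and the two predictive values are what the exercise gives). The variable names are illustrative only.

```python
# Given data from the exercise.
p_pos = 0.01   # P(T+)
ppv = 0.95     # P(D | T+)
npv = 0.98     # P(D- | T-)

p_neg = 1 - p_pos  # P(T-)

# Leaves of the tree: joint probabilities of (test result, disease status).
pos_and_d = p_pos * ppv          # P(T+ and D)
pos_and_h = p_pos * (1 - ppv)    # P(T+ and D-)
neg_and_d = p_neg * (1 - npv)    # P(T- and D)
neg_and_h = p_neg * npv          # P(T- and D-)

# Prevalence: add the two leaves where the disease is present.
prevalence = pos_and_d + neg_and_d

sensitivity = pos_and_d / prevalence        # P(T+ | D)
specificity = neg_and_h / (1 - prevalence)  # P(T- | D-)

print(f"prevalence  = {prevalence:.4f}")
print(f"sensitivity = {sensitivity:.4f}")
print(f"specificity = {specificity:.4f}")
```

With these inputs the prevalence comes out just under 3%, the sensitivity is low (roughly a third) and the specificity is very close to 1.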

haruspex · Mar 12, 2021

PeroK said:

I find it's always easier to work from a probability tree, which gives a better insight than Bayes' Theorem

Finding P(D) from the given data is very simple and does not require Bayes' Theorem. See post #3.

BRN · Mar 13, 2021

PeroK said:

Using the tree method, the prevalence (##D_+##) falls out without any effort. The other quantities, sensitivity and specificity, you can practically just read off as well.

Can I interpret ## P(T+) (PPV) + P(T+) (1-PPV) ## as the total proportion of positive tests (positive tests among the sick plus positive tests among the healthy)?

PeroK said:

That doesn't look right. You must have ##P(D|T+) + P(D-|T+) = 1## etc.

Yes! I agree. I made a mistake...

haruspex said:

What is the relationship between P(A), P(A|B) and P(A|~B)?

But isn't there a single relationship, or am I wrong?

$$ P(A|B) = \frac{P(A \cap B)}{P(B)} $$
$$ P(A|\tilde B) = \frac{P(A \cap \tilde B)}{1-P(B)} $$

I'm sorry, but I'm a beginner...

PeroK · Mar 13, 2021

BRN said:

Can I interpret ## P(T+) (PPV) + P(T+) (1-PPV) ## as the total proportion of positive tests (positive tests among the sick plus positive tests among the healthy)?

I'm sorry, but I'm a beginner...

Yes: $$P(T+) (PPV) + P(T+) (1-PPV) = P(T+)$$
I'm going to suggest you learn the probability tree method. These numbers are all related to each other and the best way to see this is a simple probability tree. The answers drop out (like ripe fruit, as it were!).

Did you understand the diagram I posted above?

haruspex · Mar 13, 2021

BRN said:

But isn't there a single relationship, or am I wrong?
$$ P(A|B) = \frac{P(A \cap B)}{P(B)} $$
$$ P(A|\tilde B) = \frac{P(A \cap \tilde B)}{1-P(B)} $$

Sorry, the way I expressed it wasn't very clear. It's easier via joint probabilities:
Can you express P(A) in terms of P(A∩B) and P(A∩~B)?
Then P(A∩B) in terms of P(A|B) and P(B) etc?

Wrt post #1, it probably would have been more helpful to have drawn a table of joint probabilities rather than of conditional probabilities. You could fill in four unknowns for these, a, b, c, d, then write expressions for the given data in terms of them. You would have got four equations.
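
One way to set this up, as a sketch: let ##a = P(T+ \cap D)##, ##b = P(T+ \cap D-)##, ##c = P(T- \cap D)## and ##d = P(T- \cap D-)##. The assignment of letters to cells is an assumption here, since nothing above fixes it. The given data then read

$$ a + b = 0.01, \quad \frac{a}{a+b} = 0.95, \quad \frac{d}{c+d} = 0.98, \quad a + b + c + d = 1 $$

The first two equations give ##a = 0.0095## and ##b = 0.0005##. Since ##c + d = 0.99##, the third gives ##d = 0.9702## and ##c = 0.0198##. The prevalence would then be

$$ P(D) = a + c = 0.0095 + 0.0198 = 0.0293 $$

with sensitivity ##a/(a+c) \approx 0.324## and specificity ##d/(b+d) \approx 0.9995##.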
