Bayes' theorem and disease prevalence

AI Thread Summary
The discussion revolves around calculating the prevalence of a disease using Bayes' theorem and understanding the sensitivity and specificity of a diagnostic test. A tampon diagnostic test yields a 1% positive result rate, with a positive predictive value of 0.95 and a negative predictive value of 0.98. Participants clarify the definitions of sensitivity and specificity, emphasizing their relationship to false positives and false negatives. The conversation highlights the confusion around calculating prevalence and suggests using a probability tree for clearer insights. Ultimately, the participants agree on the importance of correctly interpreting the relationships between the probabilities involved.
BRN
Homework Statement
Calculate prevalence, sensitivity and specificity of the diagnostic test.
Relevant Equations
Bayes theorem
Hello all!

I have to solve this exercise:
A tampon diagnostic test provides 1% positive results. The positive predictive values (probabilities of positive test disease) and negative (absence disease given negative test) are respectively 0.95 and 0.98.
  1. What is the prevalence of the disease?
  2. What are the sensitivity (probabilities of positive disease with disease) and specificity (negative test probability with disease absence) of the test?
With positive test (T+), negative test (T-), disease (D) and healthy (D-), I have this table:
          D            D-
T+     P(T+|D)      P(T+|D-)
T-     P(T-|D)      P(T-|D-)

## P(D|T+) = 0.95 ##, ## P(D-|T-) = 0.98 ## and ## P(T+) = 0.01 ##.

From Bayes theorem I can calculate sensitivity starting from:
$$ P(D|T+) = \frac{P(T+|D)P(D)}{P(T+)} $$

But how can I calculate the prevalence ## P(D) ##?
 
BRN said:
The positive predictive values (probabilities of positive test disease) and negative (absence disease given negative test) are respectively 0.95 and 0.98.

But how can I calculate the prevalence ## P(D) ##?
I hope I understand the terminology here. I assume this means that if someone has the disease then there is a 0.95 probability of a positive result (and hence a 0.05 probability of a false negative). And, if someone does not have the disease, then there is a 0.98 probability of a negative result (and a 0.02 probability of a false positive). And, I guess the prevalence means how many people have the disease.

A tampon diagnostic test provides 1% positive results.

What you could do is assume that the prevalence is ##P(D) = p##, calculate the positive results (as a function of ##p##) and equate this to 1%.

PS Although that makes no sense, as with a false positive of 0.02, you must get at least 2% positive tests, even if no one has the disease. Perhaps I've misunderstood the terminology?

PPS I did misunderstand!
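
To see why that first reading cannot be right, here is a minimal Python sketch. It assumes (wrongly, as it turns out) that 0.95 and 0.98 are the sensitivity and specificity; the function name `positive_rate` and the grid of prevalences are illustrative choices, not part of the exercise.

```python
def positive_rate(p, sensitivity=0.95, specificity=0.98):
    """P(T+) for prevalence p, under the mistaken reading that 0.95 and
    0.98 are the sensitivity and specificity of the test."""
    return sensitivity * p + (1 - specificity) * (1 - p)


# Scan prevalences from 0 to 1 in steps of 0.01.
rates = [positive_rate(i / 100) for i in range(101)]
lowest = min(rates)  # reached at p = 0, where only false positives remain

print(f"lowest possible P(T+): {lowest:.4f}")
```

Under that reading the positive rate never drops below 2%, even with no disease at all, so an observed 1% of positive tests is incompatible with it.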
 
Your post seems a bit garbled. Do you mean
BRN said:
A tampon diagnostic test provides 1% positive results. The positive predictive values (probabilities of disease given positive test) and negative (absence disease given negative test) are respectively 0.95 and 0.98.
  1. What is the prevalence of the disease?
  2. What are the sensitivity (probabilities of positive test given disease) and specificity (negative test probability with disease absence) of the test?
BRN said:
From Bayes theorem I can calculate sensitivity starting from:
$$ P(D|T+) = \frac{P(T+|D)P(D)}{P(T+)} $$

But how can I calculate the prevalence ## P(D) ##?
So it's not Bayes' Theorem you need.
What is the relationship between P(A), P(A|B) and P(A|~B)?
 
haruspex said:
Your post seems a bit garbled. Do you mean
So it's not Bayes' Theorem you need.
What is the relationship between P(A), P(A|B) and P(A|~B)?
How could you get 1% positive tests with a 2% false positive rate? That's how I read the question.
 
PeroK said:
How could you get 1% positive tests with a 2% false positive rate? That's how I read the question.
No, the 2% applies to the negative tests, not to the healthy people: P(absence of disease given negative test) = 0.98, so 2% of negative results are wrong.
 
This is interesting. The sensitivity and specificity you are asked to find are related to false positives and false negatives:

https://en.wikipedia.org/wiki/Sensitivity_and_specificity

To be precise:

True Positive (Sensitivity) = probability/proportion of positive test results for those who are positive (have the disease)
False Negative = (probability of) negative test for those who are positive

True Negative (Specificity) = probability/proportion of negative test results for those who are negative (do not have the disease)
False Positive = (probability of) positive test for those who are negative

The values we are given are:

Positive Predictive Value: probability that a person is positive given a positive result
Negative Predictive Value: probability that a person is negative given a negative result

https://en.wikipedia.org/wiki/Positive_and_negative_predictive_values

That seems to be the standard terminology.
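
As a rough illustration of how the four definitions differ, here is a small Python sketch that computes them from a 2x2 table of counts. The helper name `diagnostic_metrics` and the counts are hypothetical, chosen only to make the arithmetic easy to follow; they are not the numbers from the exercise.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV from a 2x2 table.

    tp: diseased, positive test      fp: healthy, positive test
    fn: diseased, negative test      tn: healthy, negative test
    """
    return {
        "sensitivity": tp / (tp + fn),  # P(T+ | D): conditioned on disease
        "specificity": tn / (tn + fp),  # P(T- | D-): conditioned on health
        "ppv": tp / (tp + fp),          # P(D | T+): conditioned on the test
        "npv": tn / (tn + fn),          # P(D- | T-): conditioned on the test
    }


# Hypothetical counts for 1000 people, for illustration only.
m = diagnostic_metrics(tp=90, fp=30, fn=10, tn=870)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

Sensitivity and specificity divide by the number of diseased or healthy people, while the predictive values divide by the number of positive or negative tests, which is why the two pairs generally differ.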
 
Thanks for your help.

OK, the terminology seems right.

So I have:
P(D|T+) = 0.95 Positive Predictive Value
P(D|T-) = 0.05
P(D-|T-) = 0.98 Negative Predictive Value
P(D-|T+) = 0.02
P(T+) = 0.01 positive results
P(T-) = 0.99

and I want to calculate:
P(D) = prevalence
P(T+|D) = Sensitivity
P(T-|D-) = Specificity

PeroK said:
What you could do is assume that the prevalence is ##P(D) = p##, calculate the positive results (as a function of ##p##) and equate this to 1%.

I tried to do this:
from Bayes' theorem I have

$$ P(D|T+) = \frac{P(T+|D)P(D)}{P(T+)} \Rightarrow P(T+|D)P(D)=P(D|T+)P(T+) $$

Now

$$ P(T+) = P(T+|D)P(D) + P(T+|D-)(1-P(D)) $$
$$ \Rightarrow P(T+) = P(D|T+)P(T+) + P(T+|D-)(1-P(D)) $$
$$ \Rightarrow P(T+) -P(D|T+)P(T+) = P(T+|D-)(1-P(D)) $$
$$ \Rightarrow P(T+) (1-P(D|T+)) = P(T+|D-)(1-P(D)) = FP$$
FP = False Positive.

Now I don't know how to continue...
 
Please try to answer my question in post #3.
 
BRN said:
So I have:
P(D|T+) = 0.95 Positive Predictive Value
P(D|T-) = 0.05
P(D-|T-) = 0.98 Negative Predictive Value
P(D-|T+) = 0.02
P(T+) = 0.01 positive results
P(T-) = 0.99
That doesn't look right. You must have ##P(D|T+) + P(D-|T+) = 1## etc.
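
A quick way to check that constraint is sketched below in Python. The dictionary keys and the helper `sums_to_one` are illustrative names; the first table holds the values as listed in the quoted post, the second derives the complements from the given PPV and NPV.

```python
import math


def sums_to_one(table, test_result):
    """Check that P(D|test_result) + P(D-|test_result) equals 1."""
    total = table[f"D|{test_result}"] + table[f"D-|{test_result}"]
    return math.isclose(total, 1.0)


# Values as listed in the quoted post.
listed = {"D|T+": 0.95, "D-|T+": 0.02, "D|T-": 0.05, "D-|T-": 0.98}

# Complements derived from PPV = 0.95 and NPV = 0.98.
ppv, npv = 0.95, 0.98
corrected = {"D|T+": ppv, "D-|T+": 1 - ppv, "D|T-": 1 - npv, "D-|T-": npv}

for result in ("T+", "T-"):
    print(result, "listed ok:", sums_to_one(listed, result),
          "| corrected ok:", sums_to_one(corrected, result))
```

The listed values fail the check for both test outcomes because the two complements were swapped: the complement of the PPV is ##P(D-|T+) = 0.05## and the complement of the NPV is ##P(D|T-) = 0.02##.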

BRN said:
and I want to calculate:
P(D) = prevalence
P(T+|D) = Sensitivity
P(T-|D-) = Specificity

$$ P(D|T+) = \frac{P(T+|D)P(D)}{P(T+)} \Rightarrow P(T+|D)P(D)=P(D|T+)P(T+) $$

Now

$$ P(T+) = P(T+|D)P(D) + P(T+|D-)(1-P(D)) $$
$$ \Rightarrow P(T+) = P(D|T+)P(T+) + P(T+|D-)(1-P(D)) $$
$$ \Rightarrow P(T+) -P(D|T+)P(T+) = P(T+|D-)(1-P(D)) $$
$$ \Rightarrow P(T+) (1-P(D|T+)) = P(T+|D-)(1-P(D)) = FP$$
FP = False Positive.

Now I don't know how to continue...
Okay, so you're going round in circles perhaps.

I find it's always easier to work from a probability tree, which gives a better insight than Bayes' Theorem (although it's the same information).

Using the tree method, the prevalence (##D_+##) falls out without any effort. The other quantities of sensitivity and specificity you can practically just read off as well.
 
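One possible way to lay out such a tree in code is sketched below. It assumes the first branching is on the test result (since ##P(T+)## is what we are given) and the second on disease status, using the PPV and NPV as the second-level branch probabilities. The variable names are illustrative, and the layout may not match the attached diagram exactly.

```python
p_pos = 0.01       # P(T+): 1% of tests are positive
ppv = 0.95         # P(D | T+)
npv = 0.98         # P(D- | T-)
p_neg = 1 - p_pos  # P(T-)

# Each leaf is a joint probability: first branch times second branch.
leaves = {
    ("T+", "D"): p_pos * ppv,         # positive test and diseased
    ("T+", "D-"): p_pos * (1 - ppv),  # positive test and healthy
    ("T-", "D"): p_neg * (1 - npv),   # negative test and diseased
    ("T-", "D-"): p_neg * npv,        # negative test and healthy
}

# Prevalence: add up the two leaves that end in D.
prevalence = leaves[("T+", "D")] + leaves[("T-", "D")]

# Sensitivity P(T+|D) and specificity P(T-|D-): one leaf over a column total.
sensitivity = leaves[("T+", "D")] / prevalence
specificity = leaves[("T-", "D-")] / (1 - prevalence)

print(f"prevalence  = {prevalence:.4f}")
print(f"sensitivity = {sensitivity:.4f}")
print(f"specificity = {specificity:.4f}")
```

With the given numbers the leaves are 0.0095, 0.0005, 0.0198 and 0.9702, so the prevalence comes out as 0.0293, the sensitivity as roughly 0.32 and the specificity as roughly 0.9995.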

Attachments

  • thumbnail_20210312_081259.jpg
  • #10
PeroK said:
I find it's always easier to work from a probability tree, which gives a better insight than Bayes' Theorem
Finding P(D) from the given data is very simple and does not require Bayes' Theorem. See post #3,
 
  • #11
PeroK said:
Using the tree method, the prevalence (##D_+##) falls out without any effort. The other quantities of sensitivity and specificity you can practically just read off as well.

## P(T+) (PPV) + P(T+) (1-PPV) ## I can interpret it as the total number of positive tests observed (number of positive tests relating to the sick + number of positive tests related to healthy), right?

PeroK said:
That doesn't look right. You must have ##P(D|T+) + P(D-|T+) = 1## etc.

Yes! I agree. I made a mistake...

haruspex said:
What is the relationship between P(A), P(A|B) and P(A|~B)?

But isn't there a single relationship, or am I wrong?

$$ P(A|B) = \frac{P(A \cap B)}{P(B)} $$
$$ P(A|\tilde B) = \frac{P(A \cap \tilde B)}{1-P(B)} $$

I'm sorry, but I'm a beginner...
 
  • #12
BRN said:
## P(T+) (PPV) + P(T+) (1-PPV) ## I can interpret it as the total number of positive tests observed (number of positive tests relating to the sick + number of positive tests related to healthy), right?

I'm sorry, but I'm a beginner...
Yes: $$P(T+) (PPV) + P(T+) (1-PPV) = P(T+)$$
I'm going to suggest you learn the probability tree method. These numbers are all related to each other and the best way to see this is a simple probability tree. The answers drop out (like ripe fruit, as it were!).

Did you understand the diagram I posted above?
 
  • #13
BRN said:
But isn't there a single relationship, or am I wrong?
$$ P(A|B) = \frac{P(A \cap B)}{P(B)} $$
$$ P(A|\tilde B) = \frac{P(A \cap \tilde B)}{1-P(B)} $$
Sorry, the way I expressed it wasn't very clear. It's easier via joint probabilities:
Can you express P(A) in terms of P(A∩B) and P(A∩~B)?
Then P(A∩B) in terms of P(A|B) and P(B) etc?

Wrt post #1, it probably would have been more helpful to have drawn a table of joint probabilities rather than of conditional probabilities. You could fill in four unknowns for these, a, b, c, d, then write expressions for the given data in terms of them. You would have got four equations.
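
For concreteness, one way to set this up is sketched here. The assignment of the letters to the cells is an assumption made for illustration: ##a = P(T+ \cap D)##, ##b = P(T+ \cap D-)##, ##c = P(T- \cap D)##, ##d = P(T- \cap D-)##. The given data then read:

$$ a + b + c + d = 1 $$
$$ a + b = P(T+) = 0.01 $$
$$ \frac{a}{a+b} = P(D|T+) = 0.95 $$
$$ \frac{d}{c+d} = P(D-|T-) = 0.98 $$

and the quantities asked for are:

$$ P(D) = a + c, \qquad P(T+|D) = \frac{a}{a+c}, \qquad P(T-|D-) = \frac{d}{b+d} $$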
 