Bayes' theorem and disease prevalence

In summary, the thread discusses a tampon diagnostic test that returns 1% positive results, with a positive predictive value of 0.95 and a negative predictive value of 0.98. The task is to calculate the prevalence of the disease and the sensitivity and specificity of the test. Bayes' theorem is proposed as a starting point, but the discussion moves on to how the predictive values relate to sensitivity and specificity through the joint probabilities of test result and disease status.
  • #1
BRN
Homework Statement
Calculate prevalence, sensitivity and specificity of the diagnostic test.
Relevant Equations
Bayes' theorem
Hello all!

I have to solve this exercise:
A tampon diagnostic test provides 1% positive results. The positive predictive values (probabilities of positive test disease) and negative (absence disease given negative test) are respectively 0.95 and 0.98.
  1. What is the prevalence of the disease?
  2. What are the sensitivity (probabilities of positive disease with disease) and specificity (negative test probability with disease absence) of the test?
With positive test (T+), negative test (T-), disease (D) and healthy (D-), I have this table:
       D           D-
T+     P(T+|D)     P(T+|D-)
T-     P(T-|D)     P(T-|D-)

## P(D|T+) = 0.95 ##, ## P(D-|T-) = 0.98 ## and ## P(T+) = 0.01 ##.

From Bayes theorem I can calculate sensitivity starting from:
$$ P(D|T+) = \frac{P(T+|D)P(D)}{P(T+)} $$

But how can I calculate the prevalence ## P(D) ##?
 
  • #2
BRN said:
The positive predictive values (probabilities of positive test disease) and negative (absence disease given negative test) are respectively 0.95 and 0.98.

But how can I calculate the prevalence ## P(D) ##?
I hope I understand the terminology here. I assume this means that if someone has the disease then there is a 0.95 probability of a positive result (and hence a 0.05 probability of a false negative). And, if someone does not have the disease, then there is a 0.98 probability of a negative result (and a 0.02 probability of a false positive). And, I guess the prevalence means how many people have the disease.

A tampon diagnostic test provides 1% positive results.

What you could do is assume that the prevalence is ##P(D) = p##, calculate the positive results (as a function of ##p##) and equate this to 1%.

PS Although that makes no sense, as with a false positive of 0.02, you must get at least 2% positive tests, even if no one has the disease. Perhaps I've misunderstood the terminology?

PPS I did misunderstand!
 
  • #3
Your post seems a bit garbled. Do you mean
BRN said:
A tampon diagnostic test provides 1% positive results. The positive predictive values (probabilities of disease given positive test) and negative (absence disease given negative test) are respectively 0.95 and 0.98.
  1. What is the prevalence of the disease?
  2. What are the sensitivity (probabilities of positive test given disease) and specificity (negative test probability with disease absence) of the test?
BRN said:
From Bayes theorem I can calculate sensitivity starting from:
$$ P(D|T+) = \frac{P(T+|D)P(D)}{P(T+)} $$

But how can I calculate the prevalence ## P(D) ##?
So it's not Bayes' Theorem you need.
What is the relationship between P(A), P(A|B) and P(A|~B)?
 
  • #4
haruspex said:
Your post seems a bit garbled. Do you mean [...] So it's not Bayes' Theorem you need.
What is the relationship between P(A), P(A|B) and P(A|~B)?
How could you get 1% positive tests with a 2% false positive rate? That's how I read the question.
 
  • #5
PeroK said:
How could you get 1% positive tests with a 2% false positive rate? That's how I read the question.
No, it's a false negative rate of 2%. P(absence of disease given negative test)=0.98.
 
  • #6
This is interesting. The sensitivity and specificity you are asked to find are related to False Positives and False Negatives:

https://en.wikipedia.org/wiki/Sensitivity_and_specificity

To be precise:

True Positive (Sensitivity) = probability/proportion of positive test results for those who are positive (have the disease)
False Negative = (probability of) negative test for those who are positive

True Negative (Specificity) = probability/proportion of negative test results for those who are negative (do not have the disease)
False Positive = (probability of) positive test for those who are negative

The values we are given are:

Positive Predictive Value: probability that a person is positive given a positive result
Negative Predictive Value: probability that a person is negative given a negative result

https://en.wikipedia.org/wiki/Positive_and_negative_predictive_values

That seems to be the standard terminology.
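As a rough illustration of these four definitions, here is a minimal Python sketch that computes them from the counts of a 2x2 table. The function name `diagnostic_metrics` and the counts are hypothetical, chosen only to show which cells enter which ratio; they are not the numbers from the exercise.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Return the four standard rates from the counts of a 2x2 table.

    tp: diseased, test positive     fn: diseased, test negative
    fp: healthy, test positive      tn: healthy, test negative
    """
    return {
        "sensitivity": tp / (tp + fn),  # P(T+ | D): conditions on disease status
        "specificity": tn / (tn + fp),  # P(T- | D-): conditions on disease status
        "ppv": tp / (tp + fp),          # P(D | T+): conditions on the test result
        "npv": tn / (tn + fn),          # P(D- | T-): conditions on the test result
    }


# Hypothetical counts, for illustration only (not the exercise's data)
metrics = diagnostic_metrics(tp=90, fn=10, fp=30, tn=870)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The point of the sketch is that sensitivity and specificity divide by the column totals (disease status), while the predictive values divide by the row totals (test result), which is why one pair cannot simply be read as the other.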
 
  • #7
Thanks for your help.

OK, the terminology seems right.

So I have:
P(D|T+) = 0.95 Positive Predictive Value
P(D|T-) = 0.05
P(D-|T-) = 0.98 Negative Predictive Value
P(D-|T+) = 0.02
P(T+) = 0.01 positive results
P(T-) = 0.99

and I want to calculate:
P(D) = prevalence
P(T+|D) = Sensitivity
P(T-|D-) = Specificity

PeroK said:
What you could do is assume that the prevalence is ##P(D) = p##, calculate the positive results (as a function of ##p##) and equate this to 1%.

I tried this.
From Bayes' theorem I have:

$$ P(D|T+) = \frac{P(T+|D)P(D)}{P(T+)} \Rightarrow P(T+|D)P(D)=P(D|T+)P(T+) $$

Now

$$ P(T+) = P(T+|D)P(D) + P(T+|D-)(1-P(D)) $$
$$ \Rightarrow P(T+) = P(D|T+)P(T+) + P(T+|D-)(1-P(D)) $$
$$ \Rightarrow P(T+) -P(D|T+)P(T+) = P(T+|D-)(1-P(D)) $$
$$ \Rightarrow P(T+) (1-P(D|T+)) = P(T+|D-)(1-P(D)) = FP$$
FP = False Positive.

Now I wouldn't know how to continue...
 
  • #8
BRN said:
Now I wouldn't know how to continue...
Please try to answer my question in post #3.
 
  • #9
BRN said:
So I have:
P(D|T+) = 0.95 Positive Predictive Value
P(D|T-) = 0.05
P(D-|T-) = 0.98 Negative Predictive Value
P(D-|T+) = 0.02
P(T+) = 0.01 positive results
P(T-) = 0.99
That doesn't look right. You must have ##P(D|T+) + P(D-|T+) = 1## etc.

BRN said:
and I want to calculate:
P(D) = prevalence
P(T+|D) = Sensitivity
P(T-|D-) = Specificity

$$ P(D|T+) = \frac{P(T+|D)P(D)}{P(T+)} \Rightarrow P(T+|D)P(D)=P(D|T+)P(T+) $$

Now

$$ P(T+) = P(T+|D)P(D) + P(T+|D-)(1-P(D)) $$
$$ \Rightarrow P(T+) = P(D|T+)P(T+) + P(T+|D-)(1-P(D)) $$
$$ \Rightarrow P(T+) -P(D|T+)P(T+) = P(T+|D-)(1-P(D)) $$
$$ \Rightarrow P(T+) (1-P(D|T+)) = P(T+|D-)(1-P(D)) = FP$$
FP = False Positive.

Now I wouldn't know how to continue...
Okay, so you're going round in circles perhaps.

I find it's always easier to work from a probability tree, which gives a better insight than Bayes' Theorem (although it's the same information).

Using the tree method, the prevalence (##D_+##) falls out without any effort. The other quantities of sensitivity and specificity you can practically just read off as well.
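A minimal sketch of how the tree calculation might go, assuming the first branching is on the test result and the second on disease status (this layout is an assumption, since the attached diagram is not reproduced in the text). With ##P(T+) = 0.01##, ##P(D|T+) = 0.95## and ##P(D-|T-) = 0.98##, the two leaves that end in disease add up to the prevalence:

$$ P(D) = P(T+)P(D|T+) + P(T-)\left(1 - P(D-|T-)\right) = 0.01 \times 0.95 + 0.99 \times 0.02 = 0.0293 $$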
 

[Attachment: thumbnail_20210312_081259.jpg]
  • #10
PeroK said:
I find it's always easier to work from a probability tree, which gives a better insight than Bayes' Theorem
Finding P(D) from the given data is very simple and does not require Bayes' Theorem. See post #3.
 
  • #11
PeroK said:
Using the tree method, the prevalence (##D_+##) falls out without any effort. The other quantities of sensitivity and specificity you can practically just read off as well.

Can I interpret ## P(T+) (PPV) + P(T+) (1-PPV) ## as the total proportion of positive tests observed (positive tests among the sick plus positive tests among the healthy)?

PeroK said:
That doesn't look right. You must have ##P(D|T+) + P(D-|T+) = 1## etc.

Yes! I agree. I made a mistake...

haruspex said:
What is the relationship between P(A), P(A|B) and P(A|~B)?

But isn't there a single relationship, or am I wrong?

$$ P(A|B) = \frac{P(A \cap B)}{P(B)} $$
$$ P(A|\tilde B) = \frac{P(A \cap \tilde B)}{1-P(B)} $$

I'm sorry, but I'm a beginner...
 
  • #12
BRN said:
Can I interpret ## P(T+) (PPV) + P(T+) (1-PPV) ## as the total proportion of positive tests observed (positive tests among the sick plus positive tests among the healthy)?

I'm sorry, but I'm a beginner...
Yes: $$P(T+) (PPV) + P(T+) (1-PPV) = P(T+)$$
I'm going to suggest you learn the probability tree method. These numbers are all related to each other and the best way to see this is a simple probability tree. The answers drop out (like ripe fruit, as it were!).

Did you understand the diagram I posted above?
 
  • #13
BRN said:
But isn't there a single relationship, or am I wrong?
$$ P(A|B) = \frac{P(A \cap B)}{P(B)} $$
$$ P(A|\tilde B) = \frac{P(A \cap \tilde B)}{1-P(B)} $$
Sorry, the way I expressed it wasn't very clear. It's easier via joint probabilities:
Can you express P(A) in terms of P(A∩B) and P(A∩~B)?
Then P(A∩B) in terms of P(A|B) and P(B) etc?

Wrt post #1, it probably would have been more helpful to have drawn a table of joint probabilities rather than of conditional probabilities. You could fill in four unknowns for these, a, b, c, d, then write expressions for the given data in terms of them. You would have got four equations.
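The joint-probability table suggested here can be sketched in Python. This is an illustrative sketch rather than any poster's worked solution: the function name `solve_from_predictive_values` and the dictionary layout are assumptions, and the inputs are the values given in post #1, with 1% read as 0.01.

```python
def solve_from_predictive_values(p_pos, ppv, npv):
    """Fill in the 2x2 joint table from P(T+), PPV and NPV,
    then read prevalence, sensitivity and specificity off it."""
    p_neg = 1.0 - p_pos

    # Joint probabilities P(test result AND disease status)
    joint = {
        ("T+", "D"): p_pos * ppv,           # true positives
        ("T+", "D-"): p_pos * (1.0 - ppv),  # false positives
        ("T-", "D"): p_neg * (1.0 - npv),   # false negatives
        ("T-", "D-"): p_neg * npv,          # true negatives
    }

    # Column sums give the marginal probabilities of disease status
    prevalence = joint[("T+", "D")] + joint[("T-", "D")]
    healthy = joint[("T+", "D-")] + joint[("T-", "D-")]

    sensitivity = joint[("T+", "D")] / prevalence  # P(T+ | D)
    specificity = joint[("T-", "D-")] / healthy    # P(T- | D-)
    return joint, prevalence, sensitivity, specificity


joint, prevalence, sensitivity, specificity = solve_from_predictive_values(
    p_pos=0.01, ppv=0.95, npv=0.98
)
print(f"prevalence  = {prevalence:.4f}")
print(f"sensitivity = {sensitivity:.4f}")
print(f"specificity = {specificity:.4f}")
```

With the thread's numbers this gives a prevalence of 0.0293, a sensitivity of roughly 0.32 and a specificity of roughly 0.9995. The four dictionary entries play the role of the unknowns a, b, c, d mentioned above.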
 

FAQ: Bayes' theorem and disease prevalence

What is Bayes' theorem and how is it related to disease prevalence?

Bayes' theorem is a mathematical formula that allows us to calculate the probability of an event occurring based on prior knowledge or information. It is related to disease prevalence because it can help us estimate the likelihood of a person having a disease based on their test results and the prevalence of the disease in the population.

How is Bayes' theorem used in medical research?

Bayes' theorem is commonly used in medical research to assess the accuracy of diagnostic tests and to estimate the probability of a patient having a certain disease based on their test results. It can also be used to evaluate the effectiveness of treatments and interventions.

What are the key components of Bayes' theorem?

The key components of Bayes' theorem are the prior probability, the likelihood ratio, and the posterior probability. The prior probability is the initial belief or probability of an event occurring. The likelihood ratio is the ratio of the probability of a particular test result in individuals with the disease to the probability of the same test result in individuals without the disease. The posterior probability is the updated probability of an event occurring after taking into account new information.
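These three components fit together most directly in the odds form of Bayes' theorem. As a sketch in the thread's notation (D for disease, D- for healthy, T+ for a positive test), the posterior odds equal the likelihood ratio times the prior odds:

$$ \frac{P(D|T+)}{P(D-|T+)} = \frac{P(T+|D)}{P(T+|D-)} \times \frac{P(D)}{P(D-)} $$

Here ##P(D)## is the prior probability (the prevalence), the first factor on the right is the likelihood ratio for a positive result, and ##P(D|T+)## is the posterior probability.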

How does disease prevalence affect the accuracy of a diagnostic test?

The predictive values of a diagnostic test depend on disease prevalence, even when its sensitivity and specificity stay fixed. In a population with a high disease prevalence, a positive test result is more likely to be a true positive, while in a population with a low disease prevalence, a positive test result is more likely to be a false positive.
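A small Python sketch can make this dependence concrete. The sensitivity and specificity of 0.95 used here are hypothetical values picked for illustration, and the helper name `ppv` is an assumption; only the formula itself is Bayes' theorem.

```python
def ppv(prevalence, sensitivity, specificity):
    """P(D | T+) from Bayes' theorem for a given prevalence."""
    true_pos = sensitivity * prevalence                   # P(T+ and D)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)  # P(T+ and D-)
    return true_pos / (true_pos + false_pos)


# Hypothetical test characteristics, held fixed while prevalence varies
SENS, SPEC = 0.95, 0.95
prevalences = [0.001, 0.01, 0.1, 0.5]
ppvs = [ppv(p, SENS, SPEC) for p in prevalences]

for p, v in zip(prevalences, ppvs):
    print(f"prevalence {p:>5}: PPV = {v:.3f}")
```

Under these assumed values the positive predictive value rises steadily with prevalence, reaching 0.95 only when half the population has the disease.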

What are some limitations of using Bayes' theorem in medical research?

One limitation of using Bayes' theorem in medical research is that it relies on accurate and unbiased prior probabilities. If these probabilities are incorrect or biased, it can lead to inaccurate results. Additionally, applying Bayes' theorem with a fixed sensitivity and specificity assumes that these test characteristics do not vary with the population being tested, which may not always be the case in real-world scenarios. Lastly, Bayes' theorem can be difficult to apply correctly for individuals without a background in mathematics and statistics.
