Checking My Understanding of the Naive Bayes Theorem

In summary: You could try something like this: P(Rain | <Cloudy, Morning, December> ) = P(Rain) * P(Cloudy|Rain) * P(Morning|Rain) * P(Dec|Rain). This is just the probability of rain for the given conditions.
  • #1
jisbon
Homework Statement
Given the following statistics from a specific location:
60% of rainy days start out cloudy in the morning.
40% of all mornings are cloudy.
In December, it rains on average 9 out of 30 days.

In Dec, when it is a cloudy morning, what is the probability based on the given statistics that it is going to rain?
Relevant Equations
P(A|B) = P(B|A) P(A) / P(B)
I would like to check my understanding here to see if it is correct as I am currently stuck at the moment.
From the question, I can gather that:
P(Rain | Dec) = 9/30
P(Cloudy | Rain) = 0.6?
P(Cloudy | Rain) = 0.4

To answer the question:
P(Rain | <Cloudy, Morning, December> ) = P(Rain) * P(Cloudy|Rain) * P(Morning|Rain) * P(Dec|Rain)
= ? * 0.6 * ? * 9/30

This is going towards the Naive Bayes Theorem though, right? I think my initial thought process may be already wrong. Any guidance is greatly appreciated. Thank you!
 
  • #2
First, we have to assume that the data applies to December. If the data about cloudy and non-cloudy days is different in December from the annual average, then we have too little data to go on.

In that sense, it is just a straight calculation for December.
jisbon said:
Homework Statement: Given the following statistics from a specific location:
60% of rainy days start out cloudy in the morning.
40% of all mornings are cloudy.
In December, it rains on average 9 out of 30 days.

In Dec, when it is a cloudy morning, what is the probability based on the given statistics that it is going to rain?
Relevant Equations: P(A|B) = P(B|A) P(A) / P(B)
I may have advised this before, but if you use the raw equations, then you risk getting lost in a sea of conditional probabilities. You should try a probability tree for these problems.
jisbon said:
I would like to check my understanding here to see if it is correct as I am currently stuck at the moment.
From the question, I can gather that:
P(Rain | Dec) = 9/30
This is just the probability of rain for the problem. You don't need the conditional probability for December, as it is assumed across all calculations that we are in December.
jisbon said:
P(Cloudy | Rain) = 0.6?
Correct. That's what "60% of rainy days start out cloudy" means.
jisbon said:
P(Cloudy | Rain) = 0.4
No. ##0.4## is the probability that any given morning is cloudy, whether or not it rains later. That is P(Cloudy), not P(Cloudy | Rain).
jisbon said:
To answer the question:
P(Rain | <Cloudy, Morning, December> ) = P(Rain) * P(Cloudy|Rain) * P(Morning|Rain) * P(Dec|Rain)
= ? * 0.6 * ? * 9/30
You're just lost now trying to use these horrible equations, rather than a nice probability tree!
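
For anyone who wants to see where the tree leads, here is a minimal sketch in Python. It assumes, as stated at the top of this reply, that the 60% and 40% figures also hold in December. The variable names are illustrative and not from the thread. Exact fractions are used so the branch values are easy to check by hand.

```python
from fractions import Fraction

# Figures from the homework statement. Applying the 60% and 40% figures
# to December is an assumption, not something the data guarantees.
p_rain = Fraction(9, 30)               # P(Rain) in December
p_cloudy_given_rain = Fraction(6, 10)  # P(Cloudy | Rain)
p_cloudy = Fraction(4, 10)             # P(Cloudy), over all mornings

# Branch "Rain, then Cloudy": multiply along the branch.
p_rain_and_cloudy = p_rain * p_cloudy_given_rain

# Of all cloudy mornings, the fraction that sit on the rainy branch.
p_rain_given_cloudy = p_rain_and_cloudy / p_cloudy

# The rest of the cloudy mornings must sit on the dry branch.
p_dry_and_cloudy = p_cloudy - p_rain_and_cloudy
p_cloudy_given_dry = p_dry_and_cloudy / (1 - p_rain)

print(p_rain_given_cloudy, float(p_rain_given_cloudy))  # 9/20 0.45
```

Under those assumptions the tree gives P(Rain | Cloudy) = 0.18 / 0.40 = 0.45, which is the same number Bayes' rule in the Relevant Equations produces. No "Morning" or "Dec" factors are needed.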
 

FAQ: Checking My Understanding of the Naive Bayes Theorem

What is the Naive Bayes Theorem?

Strictly speaking, there is no "Naive Bayes Theorem". Bayes' theorem is the identity P(A|B) = P(B|A) P(A) / P(B), which relates a conditional probability to its reverse. Naive Bayes is a classification algorithm built on that theorem: it estimates the probability of each class given the observed features and predicts the most probable class.

How does the Naive Bayes Theorem work?

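Naive Bayes applies Bayes' rule to each candidate class. It multiplies the prior probability of the class by the probability of each observed feature given that class, then normalizes the results across classes so they sum to 1. The "naive" part is the assumption that the features are independent of each other once the class is known, which is what allows the per-feature probabilities to be multiplied.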
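
A minimal sketch of that calculation in Python follows. The function name, the "windy" feature, and all the numbers are hypothetical and chosen only for illustration; none of them come from the thread above. The step to notice is the final normalization, without which the products are scores rather than probabilities.

```python
def naive_bayes_posterior(priors, likelihoods, observed):
    """Return {class: P(class | observed)} under the naive independence assumption.

    priors:      {class: P(class)}
    likelihoods: {class: {feature: P(feature | class)}}
    observed:    list of feature names that were observed
    """
    scores = {}
    for cls, prior in priors.items():
        score = prior
        for feature in observed:
            # Naive step: multiply per-feature likelihoods as if independent.
            score *= likelihoods[cls][feature]
        scores[cls] = score

    # Normalize so the posteriors over all classes sum to 1.
    total = sum(scores.values())
    return {cls: score / total for cls, score in scores.items()}


# Hypothetical numbers, for illustration only.
priors = {"rain": 0.3, "dry": 0.7}
likelihoods = {
    "rain": {"cloudy": 0.6, "windy": 0.5},
    "dry": {"cloudy": 0.2, "windy": 0.25},
}

posterior = naive_bayes_posterior(priors, likelihoods, ["cloudy", "windy"])
print(posterior)
```

With these made-up inputs the unnormalized scores are 0.09 for rain and 0.035 for dry, so the posterior for rain is 0.09 / 0.125 = 0.72.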

What are the assumptions made by the Naive Bayes Theorem?

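Naive Bayes assumes that the features are conditionally independent given the class, so that knowing one feature tells you nothing more about another once the class is fixed. Any further distributional assumption depends on the variant: Gaussian Naive Bayes assumes each continuous feature is normally distributed within each class, while other variants model counts or binary features instead. Each feature contributes through its own likelihood term, with no weights or interactions between features.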
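
Written out, the independence assumption looks roughly like this. The symbols are introduced here for illustration and do not appear in the thread: ##C## is the class, ##x_1, \dots, x_n## are the observed features, and ##c'## ranges over all classes.

```latex
% Conditional independence of the features given the class (the "naive" assumption):
P(x_1, \dots, x_n \mid C) = \prod_{i=1}^{n} P(x_i \mid C)

% Substituting into Bayes' theorem gives the Naive Bayes posterior:
P(C \mid x_1, \dots, x_n)
  = \frac{P(C) \prod_{i=1}^{n} P(x_i \mid C)}
         {\sum_{c'} P(c') \prod_{i=1}^{n} P(x_i \mid c')}
```

The denominator is the same for every class, so it can be skipped when only the most probable class is needed, but it is required if an actual probability is wanted.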

What are the advantages of using the Naive Bayes Theorem?

Naive Bayes is simple to understand and cheap to train, since it only needs per-class estimates for each feature, and it scales well to large datasets and many features. It performs best when the independence assumption approximately holds, though it is often a reasonable baseline even when it does not. It is fairly tolerant of irrelevant features, and a missing feature value can be handled by leaving that feature's term out of the product.

What are the limitations of the Naive Bayes Theorem?

Naive Bayes assumes that the features are conditionally independent given the class, which often fails in real-world data. Correlated features are effectively counted more than once, so the predicted probabilities can be overconfident even when the predicted class is right. It cannot model interactions between features, and class priors estimated from an imbalanced dataset can bias predictions toward the majority class.
