Relative Entropy or Kullback-Leibler divergence

In summary, the task is to calculate the relative entropy between two sets of data, specifically their frequency-of-occurrence matrices. Set 1 serves as the base set, while Set 2 is the matrix produced after a variable number of characters is mutated. The goal is to understand how to calculate the relative entropy, but online resources have been hard to follow because they rarely define their variables.
  • #1
bowlbase

Homework Statement


I am supposed to calculate the relative entropy between two sets of data:
Set 1 (base set):
A C G T
0 0 0 10
0 0 0 10
0 0 10 0
0 10 0 0
10 0 0 0
* * * * //Randomized
0 0 0 10
0 10 0 0

Set 2:
A C G T
0 0 0 10
0 0 0 10
0 0 10 0
0 10 0 0
10 0 0 0
1 4 1 4
0 0 0 10
0 10 0 0

These are frequency-of-occurrence matrices. Set 2 is the matrix produced after a variable number of characters is mutated; in this case only one character in the third row from the bottom was mutated, which is why that row has no 10s. Every other position did not mutate, so its counts match Set 1. I have 70 other sets of this data with various numbers of mutations and lengths.

I am trying to read about this online, but the information is convoluted and often seems to actively avoid defining variables. Can someone walk me through the process?
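For reference, here is a minimal sketch of one way to set the calculation up in Python, treating each position (row) independently and summing the per-position divergences. The pseudocount of 0.5, the base-2 logarithm, and the choice of which set plays the role of P are assumptions made for illustration, and the two example rows are illustrative rather than taken from the actual paired data above.

import math

def row_to_probs(counts, pseudocount=0.5):
    # Turn a row of raw counts into probabilities. The pseudocount keeps
    # zero counts from producing zero probabilities, which would make the
    # log ratio undefined.
    smoothed = [c + pseudocount for c in counts]
    total = sum(smoothed)
    return [s / total for s in smoothed]

def kl_divergence(p, q):
    # D(P || Q) = sum_i p_i * log2(p_i / q_i), measured in bits.
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q))

# One row per sequence position, one column per letter (A, C, G, T).
set1 = [[0, 0, 0, 10],   # an unchanged position
        [0, 0, 10, 0]]   # another position
set2 = [[0, 0, 0, 10],
        [1, 4, 1, 4]]    # a position where characters were mutated

# Relative entropy is computed per position and summed over the sequence.
total = sum(kl_divergence(row_to_probs(p_row), row_to_probs(q_row))
            for p_row, q_row in zip(set1, set2))
print(total)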

Homework Equations

The Attempt at a Solution

 
  • #2
Nevermind, I've got it!
 

FAQ: Relative Entropy or Kullback-Leibler divergence

What is Relative Entropy or Kullback Leibler divergence?

Relative Entropy or Kullback-Leibler divergence is a measure of the difference between two probability distributions. It measures how much information is lost when one distribution is used to approximate another.

How is Relative Entropy calculated?

Relative Entropy is calculated by summing, over every possible event, the probability of that event under the first distribution multiplied by the logarithm of the ratio of the two distributions' probabilities for that event.
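Written out, for two discrete distributions $P$ and $Q$ over the same set of events $x$, this is

$$D_{\mathrm{KL}}(P \,\|\, Q) = \sum_{x} P(x) \log \frac{P(x)}{Q(x)},$$

with the convention that terms where $P(x) = 0$ contribute zero.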

What is the significance of Relative Entropy in statistics?

Relative Entropy is an important concept in statistics as it allows for the comparison of two probability distributions and can be used to assess the accuracy of a model or prediction. It is also used in machine learning and information theory.

What are the applications of Relative Entropy?

Relative Entropy has various applications in fields such as statistics, machine learning, and information theory. It is commonly used in data compression, model selection, and hypothesis testing.

Are there any limitations to using Relative Entropy?

One limitation of Relative Entropy is that it is not a symmetric measure: the divergence from one distribution to another is generally not the same as the divergence computed in the reverse order. It is also undefined (or infinite) whenever the second distribution assigns zero probability to an event that the first distribution considers possible, which is why zero counts are usually smoothed with a small pseudocount before the calculation. Additionally, it can be affected by the choice of reference distribution and may not be suitable for all types of data distributions.
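A quick numerical illustration of the asymmetry, using two arbitrarily chosen distributions:

import math

def kl(p, q):
    # D(P || Q) in bits.
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q))

p = [0.5, 0.5]
q = [0.9, 0.1]
print(kl(p, q))  # about 0.74 bits
print(kl(q, p))  # about 0.53 bits, a different value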
