Shannon entropy

In information theory, the entropy of a random variable is the average level of "information", "surprise", or "uncertainty" inherent in the variable's possible outcomes. The concept of information entropy was introduced by Claude Shannon in his 1948 paper "A Mathematical Theory of Communication", and is sometimes called Shannon entropy in his honour. As an example, consider a biased coin with probability p of landing on heads and probability 1 − p of landing on tails. The maximum surprise is for p = 1/2, when there is no reason to expect one outcome over another, and in this case a coin flip has an entropy of one bit. The minimum surprise is when p = 0 or p = 1, when the event is known and the entropy is zero bits. Other values of p give different entropies between zero and one bits.
Given a discrete random variable



X


{\displaystyle X}
, with possible outcomes




x

1


,
.
.
.
,

x

n




{\displaystyle x_{1},...,x_{n}}
, which occur with probability




P

(

x

1


)
,
.
.
.
,

P

(

x

n


)
,


{\displaystyle \mathrm {P} (x_{1}),...,\mathrm {P} (x_{n}),}
the entropy of



X


{\displaystyle X}
is formally defined as:

where



Σ


{\displaystyle \Sigma }
denotes the sum over the variable's possible values and



log


{\displaystyle \log }
is the logarithm, the choice of base varying between different applications. Base 2 gives the unit of bits (or "shannons"), while base e gives the "natural units" nat, and base 10 gives a unit called "dits", "bans", or "hartleys". An equivalent definition of entropy is the expected value of the self-information of a variable.The entropy was originally created by Shannon as part of his theory of communication, in which a data communication system is composed of three elements: a source of data, a communication channel, and a receiver. In Shannon's theory, the "fundamental problem of communication" – as expressed by Shannon – is for the receiver to be able to identify what data was generated by the source, based on the signal it receives through the channel. Shannon considered various ways to encode, compress, and transmit messages from a data source, and proved in his famous source coding theorem that the entropy represents an absolute mathematical limit on how well data from the source can be losslessly compressed onto a perfectly noiseless channel. Shannon strengthened this result considerably for noisy channels in his noisy-channel coding theorem.
Entropy in information theory is directly analogous to the entropy in statistical thermodynamics. The analogy results when the values of the random variable designate energies of microstates, so Gibbs formula for the entropy is formally identical to Shannon's formula. Entropy has relevance to other areas of mathematics such as combinatorics. The definition can be derived from a set of axioms establishing that entropy should be a measure of how "surprising" the average outcome of a variable is. For a continuous random variable, differential entropy is analogous to entropy.

View More On Wikipedia.org
Back
Top