Newbie question: Algebra of Mahalanobis distance

anja.ende · Nov 11, 2013

Hello,

The Mahalanobis distance, or rather its square, is defined as:

[itex](X-\mu)^2/\Sigma[/itex] which is then written as

[itex](X-\mu)^{T} \Sigma^{-1}(X-\mu)[/itex]

Σ is the covariance matrix. My silly question is: why is Σ placed in the middle of the dot product of the (X-μ) vector with itself? I am sure this makes sense mathematically (it reduces the output to a scalar), but I would like to know the intuitive reason behind it.

Thanks a lot!
Anja

Office_Shredder · Nov 11, 2013

The idea behind the Mahalanobis distance is that, in the one-dimensional case, you are measuring how many standard deviations X is from the mean. In the multidimensional case, [itex]\Sigma[/itex] is a positive (semi)definite matrix, which has a unique positive (semi)definite square root that I will call S. (For [itex]\Sigma^{-1}[/itex] to exist, [itex]\Sigma[/itex] must be positive definite, and then S is positive definite as well.) S plays the same role as the standard deviation. The expression above is then the same as

[tex] \left( S^{-1}(X-\mu) \right)^T \left(S^{-1}(X-\mu) \right) [/tex]

Basically, you scale the random vector [itex]X-\mu[/itex] by the standard deviation, just as you would in the one-dimensional case.
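
To spell out why the two expressions agree, here is a short derivation. It assumes [itex]\Sigma[/itex] is positive definite, so that S is symmetric and invertible with [itex]S S = \Sigma[/itex]:

[tex] \left( S^{-1}(X-\mu) \right)^T \left( S^{-1}(X-\mu) \right) = (X-\mu)^T \left(S^{-1}\right)^T S^{-1} (X-\mu) = (X-\mu)^T (S S)^{-1} (X-\mu) = (X-\mu)^T \Sigma^{-1} (X-\mu) [/tex]

The middle step uses the symmetry of S, which gives [itex]\left(S^{-1}\right)^T = S^{-1}[/itex], together with [itex]S^{-1} S^{-1} = (S S)^{-1}[/itex].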

D H · Nov 11, 2013

anja.ende said:

[itex](X-\mu)^{T} \Sigma^{-1}(X-\mu)[/itex]

Σ is the covariance matrix. My silly question is: why is Σ placed in the middle of the dot product of the (X-μ) vector with itself? I am sure this makes sense mathematically (it reduces the output to a scalar), but I would like to know the intuitive reason behind it.

The expression ##(X-\mu)^T \Sigma^{-1}(X-\mu) = \sigma^2## defines a family of hyperellipsoids in the N-dimensional space in which X and μ live, characterized by the scalar parameter σ. I used σ intentionally. Think of σ as representing "standard deviations". For example, ##(X-\mu)^T \Sigma^{-1}(X-\mu) = 1## is the one sigma hyperellipsoid.

The Mahalanobis distance is essentially a measure of how many standard deviations a point X is from the mean μ.
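
As a quick numerical illustration of the "how many standard deviations" reading, here is a minimal sketch. It is not from the thread: the helper name `mahalanobis_2d`, the mean, and the covariance values are made up for illustration, and it is restricted to two dimensions so that the matrix inverse can be written out by hand.

```python
import math

def mahalanobis_2d(x, mu, cov):
    """Mahalanobis distance of a 2-D point x from mean mu under a 2x2 covariance cov."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if a <= 0 or det <= 0:
        raise ValueError("covariance must be positive definite")
    # Explicit inverse of a 2x2 matrix.
    inv = ((d / det, -b / det), (-c / det, a / det))
    dx = (x[0] - mu[0], x[1] - mu[1])
    # y = cov^{-1} (x - mu)
    y = (inv[0][0] * dx[0] + inv[0][1] * dx[1],
         inv[1][0] * dx[0] + inv[1][1] * dx[1])
    # Square root of the quadratic form (x - mu)^T cov^{-1} (x - mu).
    return math.sqrt(dx[0] * y[0] + dx[1] * y[1])

mu = (0.0, 0.0)
cov = ((4.0, 0.0), (0.0, 1.0))  # std dev 2 along axis 1, std dev 1 along axis 2

d_a = mahalanobis_2d((2.0, 0.0), mu, cov)  # one std dev along axis 1
d_b = mahalanobis_2d((0.0, 1.0), mu, cov)  # one std dev along axis 2
d_c = mahalanobis_2d((0.0, 2.0), mu, cov)  # two std devs along axis 2
print(d_a, d_b, d_c)
```

The first two points are at different Euclidean distances from the mean (2 and 1), yet both lie on the one-sigma ellipse; the third lies on the two-sigma ellipse.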

anja.ende · Nov 11, 2013

Thank you guys!

blue_raver22 · Nov 18, 2013

Hello Anja,

Thank you for your question. The placement of the covariance matrix (Σ) in the Mahalanobis distance formula is not arbitrary; there is mathematical reasoning behind it. To understand this, let's first review what the Mahalanobis distance measures.

The Mahalanobis distance is a measure of how far a data point X is from the mean μ of a distribution (or, more generally, how far apart two data points are), taking into account the covariance between variables. In other words, it accounts for the correlation and scale of each variable, rather than simply looking at the raw difference in each individual variable. This is especially useful when dealing with high-dimensional data, where the variables may be correlated with each other.

Now, let's look at the formula (X-μ)^{T} Σ^{-1} (X-μ). The factor (X-μ) is the vector of differences between the point and the mean in each variable. The middle factor, Σ^{-1}, is the inverse of the covariance matrix, which rescales those differences by the variances and adjusts for the correlation between variables. By multiplying Σ^{-1} by (X-μ), we are essentially normalizing the differences in each variable while taking their correlation into account. Finally, multiplying the result on the left by (X-μ)^{T} takes the dot product of the raw differences with the normalized ones, resulting in a scalar value: the squared distance of X from μ.

In summary, Σ^{-1} sits in the middle of the formula because that is where it can account for the scale and covariance of the variables before the dot product reduces everything to a scalar. I hope this helps to clarify the reasoning behind it.
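
To make the role of correlation concrete, here is a small sketch. The helper `squared_mahalanobis_2d`, the correlation value 0.9, and the test points are assumptions chosen for illustration, not anything from the thread. Both points are the same Euclidean distance from the mean, but one lies along the direction in which the variables tend to vary together and the other lies against it.

```python
def squared_mahalanobis_2d(dx, cov):
    """Quadratic form dx^T cov^{-1} dx for a 2-vector dx and a 2x2 covariance cov."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if a <= 0 or det <= 0:
        raise ValueError("covariance must be positive definite")
    # y = cov^{-1} dx, using the explicit 2x2 inverse.
    y0 = (d * dx[0] - b * dx[1]) / det
    y1 = (-c * dx[0] + a * dx[1]) / det
    return dx[0] * y0 + dx[1] * y1

# Unit variances with correlation 0.9 (illustrative values).
cov = ((1.0, 0.9), (0.9, 1.0))

sq_along = squared_mahalanobis_2d((1.0, 1.0), cov)    # follows the correlation
sq_across = squared_mahalanobis_2d((1.0, -1.0), cov)  # goes against the correlation
print(sq_along, sq_across)
```

With these values the squared distance along the correlated direction is 2/1.9 (about 1.05), while across it the squared distance is 2/0.1 = 20, so the second point is far more unusual even though both have Euclidean length √2.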

Best,
