Understanding the transformation of the skewness formula

In summary, the sample skewness formula uses a correction term to account for a downward bias in small samples, and asymptotically the formula reduces to the mean cubed standardized deviation.
  • #1
Vital
Hello.

Here is the formula that computes the sample skewness:

##Sk = \frac{n}{(n-1)(n-2)} \cdot \frac{\sum_{i=1}^{n} (X_i - \bar{X})^3}{s^3},##
where ##n## is the number of elements in the sample,
##X_i## is the i-th value of the sample, with ##i## running from 1 to n,
##\bar{X}## is the arithmetic mean, and ##s## is the sample standard deviation.

I have two questions about this formula:

1) The book says that the term ##\frac{n}{(n-1)(n-2)}## in the above equation corrects for a downward bias in small samples. What does that mean, and how does the correction happen? For example, if n = 5, then ##\frac{n}{(n-1)(n-2)} = \frac{5}{4 \cdot 3} \approx 0.4167##.

I see it as if, by using this part of the equation, we are taking only about 42 percent of the second factor, ##\frac{\sum (X_i - \bar{X})^3}{s^3}##. How does that help to correct for the downward bias?

2) The book also says that as n becomes large, the expression reduces to the mean cubed deviation: ##Sk \approx \frac{1}{n} \cdot \frac{\sum (X_i - \bar{X})^3}{s^3}##.
How does this happen mathematically? I don't see it. For example, if n = 1000, how does ##\frac{1000}{999 \cdot 998}## turn into ##\frac{1}{n}##?

Thank you very much.
 
  • #2
This post really needs LaTeX. Consider using:

https://www.physicsforums.com/help/latexhelp/
and also

https://www.codecogs.com/latex/eqneditor.php
- - - -
Let's work backward:

Question 2:

are you familiar with the fact that
##\lim_{n \to \infty} \frac{n}{n-2} = 1##

or as people say
## \frac{n}{n-2} \approx 1##

for large enough ##n##.
- - - -
now consider

##\frac{1}{n-1} \approx \frac{1}{n}##

for large enough ##n## (why?)

putting these together

## \frac{n}{(n-1)(n-2)} = \big(\frac{n}{(n-2)}\big)\big(\frac{1}{n-1}\big) \approx \big(1\big)\big( \frac{1}{n}\big)= \frac{1}{n}##

for large enough ##n## -- i.e. you can consider the asymptotic behavior separately. (If you prefer: take advantage of positivity and consider the limiting effects while working in logspace and then exponentiate your way back at the end.)
- - - -
Outside-the-scope thought: the rate of convergence here is actually pretty good. In your example, compare ##\frac{1}{1000} \text{ vs. } \frac{1000}{(999)(998)}## -- they are actually pretty close to each other -- i.e. the first is ##0.001000## and the second is ##0.001003##, where I rounded to the 6th decimal place.

There are certain results that are asymptotically correct but require exponentially more data to be a valid approximation -- these are things to be suspicious of -- but they don't apply here because convergence is pretty good.
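If you want to see the convergence numerically, here is a quick check in Python (a minimal sketch; the sample sizes are arbitrary):

```python
# Compare the exact factor n/((n-1)(n-2)) with its large-n
# approximation 1/n for a few sample sizes.
for n in (10, 100, 1_000, 10_000):
    exact = n / ((n - 1) * (n - 2))
    approx = 1 / n
    print(f"n = {n:>6}: exact = {exact:.8f}, 1/n = {approx:.8f}, "
          f"ratio = {exact / approx:.6f}")
```

The ratio column drifts toward 1, which is exactly the ##\frac{n}{(n-1)(n-2)} \approx \frac{1}{n}## statement above.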
- - - -
Question 1:

Are you familiar with the ##(n-1)## correction used in calculating sample variance? I think you should know this inside and out before considering sample skewness -- (a) because it is simpler and (b) because variance is much more important than skew -- in particular for the Central Limit Theorem, but also because estimates get less and less reliable the further up the moment curve that you go when you have noisy data.

The corrections for skew follow the same logic, just one moment higher -- i.e. if you look at the moments involved, there are the 3rd, 1st, and 2nd moments. But the first and second moments are not 'pure' -- they are each sample moments, and hence a data point / degree of freedom gets chewed up for each -- the ## \frac{n}{(n-1)(n-2)}## corrects for that. I'm sure you can dig around and find the exact derivation for these skew corrections, but I don't think it's going to be that insightful. And more to the point: as noted in Question 2, asymptotically these corrections don't matter. Put differently: if you are dealing with small amounts of data, you need to pay attention to this stuff. But if you are dealing with medium or big data, it really doesn't matter.
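To make the downward bias concrete, here is a minimal simulation sketch (the use of numpy, the exponential test distribution, and the function names are my own choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def skew_corrected(x):
    """Sample skewness with the n/((n-1)(n-2)) correction, as in the formula above."""
    n = len(x)
    s = x.std(ddof=1)  # sample standard deviation (n-1 denominator)
    return n / ((n - 1) * (n - 2)) * np.sum(((x - x.mean()) / s) ** 3)

def skew_raw(x):
    """Uncorrected version: the mean cubed standardized deviation."""
    s = x.std(ddof=1)
    return np.mean(((x - x.mean()) / s) ** 3)

# The exponential(1) distribution has true skewness 2.  Averaging each
# estimator over many small samples shows the uncorrected one landing
# well below 2; the correction pulls the average back up toward it.
# (Neither is exactly unbiased here, but the corrected one is much closer.)
n, trials = 5, 50_000
samples = rng.exponential(size=(trials, n))
print("corrected:", np.mean([skew_corrected(x) for x in samples]))
print("raw:      ", np.mean([skew_raw(x) for x in samples]))
```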
 
  • #3
The ##(n-1)(n-2)## term corrects for the fact that the expression uses the sample mean rather than the true mean. Another way of looking at it is that the mathematical expectation of the expression using ##(n-1)(n-2)## is the theoretical value for the given distribution.
 
  • #4
StoneTemplePython said:
...

Thank you very much for this truly helpful post. I am sorry I didn't reply earlier. Now I understand the concept; I simply wasn't familiar with these corrections before.
Thank you once again.
 

FAQ: Understanding the transformation of the skewness formula

What is skewness and why is it important?

Skewness is a measure of the asymmetry of a distribution. It indicates whether the data is concentrated more on one side of the mean or the other. Understanding skewness is important because it helps us to better interpret and analyze data.

How is skewness calculated?

The most commonly used formula for calculating skewness is the (Fisher-)Pearson moment coefficient of skewness: the average of the cubed deviations from the mean, divided by the cube of the standard deviation. For small samples, a correction factor such as ##\frac{n}{(n-1)(n-2)}## is applied, as in the formula discussed above.
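For reference, SciPy implements this corrected formula; here is a short sketch showing the hand computation agreeing with the library (the data is arbitrary):

```python
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(1)
x = rng.exponential(size=25)

# The corrected formula from the thread, written out by hand.
n, s = len(x), x.std(ddof=1)
by_hand = n / ((n - 1) * (n - 2)) * np.sum(((x - x.mean()) / s) ** 3)

# bias=False applies the same small-sample correction; bias=True
# (the default) returns the uncorrected moment coefficient.
print(by_hand, skew(x, bias=False))              # agree to floating point
print(np.isclose(by_hand, skew(x, bias=False)))  # True
```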

What does a positive or negative skewness value indicate?

A positive skewness value indicates that the data is skewed to the right, meaning that there is a longer tail on the right side of the distribution. A negative skewness value indicates that the data is skewed to the left, with a longer tail on the left side.

Can skewness be used to determine the shape of a distribution?

Yes, the value of skewness can give an idea of the shape of a distribution. A symmetric distribution has a skewness of 0, while values greater than or less than 0 indicate asymmetry (though a skewness of 0 alone does not guarantee symmetry).

What are some limitations of using skewness?

Skewness is just one measure of the shape of a distribution and should not be used as the sole indicator. It does not provide information about the specific location of the data points and can be influenced by outliers. Additionally, it may not accurately describe the shape of a distribution if the data is multimodal.
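As a quick illustration of the outlier sensitivity (a sketch; the dataset and the outlier value are arbitrary):

```python
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(2)
x = rng.normal(size=50)       # roughly symmetric data, skewness near 0
x_out = np.append(x, 10.0)    # one large outlier

# A single point can dominate the cubed deviations.
print("without outlier:", skew(x, bias=False))
print("with outlier:   ", skew(x_out, bias=False))
```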
