Order Statistics Definition: Explained

  • Thread starter kingwinner
  • Tags
    Statistics
In summary: If X is a continuous random variable with probability density f, the probability that X lies in an interval [a, b] is \int_a^b f(x)dx, so the probability that X equals any single number is 0. Likewise, for two independent continuous random variables X1 and X2, the probability that X1 = X2 is 0, which is why the equality signs in X(1)<=X(2)<=...<=X(n) can be ignored in the continuous case. For discrete variables such as die rolls, ties do occur and the equality signs matter.
  • #1
kingwinner
Definition: Consider a set of random variables X1,X2,...,Xn. The "order statistics" are the same random variables arranged in increasing (actually nondecreasing) order, i.e. X(1)<=X(2)<=...<=X(n)

After reading a few more paragraphs, I immediately ran into deep confusion because I cannot make any sense of it...

My confusions:

1. X1,X2,...,Xn are random VARIABLES (not constants), so you don't know exactly what the outcome will be. For example the set of possible values of each Xi may be the interval [0,1]. How can you possibly order something that you can't even tell the exact value? (this is like saying [0,1] is less than [0,1]) This makes no intuitive sense to me...


2. "Distributions of Xi's and X(i)'s are NOT the same."
How can this be true? They are just the SAME random variables ordered in a different way, so how can their distributions possibly be different? I can't make sense of this either...


3. "If the random variables X1,X2,...,Xn are continuous, the equality signs in X(1)<=X(2)<=...<=X(n) can be ignored."
Now...I can't see WHY?


Could someone please explain? I have read the related material in 2 different textbooks, but they are saying pretty much the same thing and do not clear my doubts at all.

Any help would be appreciated!
 
  • #2
If x is a continuous random variable, then the probability that x lies in a given interval [a, b] is [itex]\int_a^b f(x)\,dx[/itex], where f(x) is the probability density function. If a = b, that integral is 0: the probability that x equals any specific number is 0. Likewise, if x1 and x2 are two independent continuous random variables, then the probability that x1 = x2 is 0.
 
  • #3
3. But for P(X1=X3), X3 is not fixed (it's a variable), so why would it still have zero probability?
 
  • #4
kingwinner said:
3. But for P(X1=X3), X3 is not fixed (it's a variable), so why would it still have zero probability?

For a continuous random variable, the probability of being exactly equal to any particular value is zero. Here is a bad way to think about it: if X is a random number in the continuous interval [0,1], the probability that X=1 is 1 divided by the number of real numbers in [0,1], which is infinite, so 1/infinity = 0. I know, division by infinity is undefined, but I'm cheating to give you an intuitive glimpse. The same holds when the target is itself random: whatever value X3 turns out to take, the probability that an independent X1 hits exactly that value is zero.

Another way to think about this is if I ask you to cut me a piece of string that is exactly 12 inches. You would probably cut me a string that is 11.9999 inches or 12.00001 inches, but never 12 inches exactly, for I can keep measuring down to the molecular level, and beyond.
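
One way to see why a non-fixed X3 does not change the answer is to condition on the value it takes. The derivation below is only a sketch: it assumes X1 and X3 are independent and continuous, and the density symbol f_{X_3} is introduced here for illustration and does not appear in the thread.

```latex
P(X_1 = X_3)
  = \int_{-\infty}^{\infty} P(X_1 = x \mid X_3 = x)\, f_{X_3}(x)\, dx
  = \int_{-\infty}^{\infty} P(X_1 = x)\, f_{X_3}(x)\, dx
  = \int_{-\infty}^{\infty} 0 \cdot f_{X_3}(x)\, dx
  = 0
```

The second equality uses independence, and the third uses the fact that a continuous random variable equals any fixed number x with probability 0.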
 
  • #5
1) If you roll a die five times, can you arrange the die rolls in nondecreasing order?

2) Let's return to our die example. Let's say you roll a die five times, with X1, X2, X3, X4, X5 being the rolls and X(1),...,X(5) being the same rolls in nondecreasing order. Does X1 have the same distribution as X(1)?
 
  • #6
1) So you're referring to things that are going on after the experiment has been done, namely "particular values" which would typically be labeled by small letters (e.g. x,y,z) instead of capital letters for random variables (e.g. X,Y,Z).
If we know the outcomes, then sure we can order them. But I thought the whole point of RANDOM variables is that we cannot predetermine the outcome (we don't know the outcome until we run the trial).

2) I think so. The probability would just be 1/6 for each of 1,2,3,4,5,6 on each side of the die, right?

3) I think I am OK with the idea "If X is a continuous random variable, then P(X=2)=0", but what if P(X=Y), i.e. X=Y where Y is another random variable instead of X=(a constant)?

Thanks!
 
  • #7
1) You just need to reread the definitions and think about it some more.

2) Actually, the probabilities aren't equal. The probability that X1 is equal to 6 is 1/6, but the probability that X(1) is equal to 6 is only (1/6)^5 = 1/7776, because if the smallest roll is equal to 6, then ALL five rolls are equal to 6.
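
The 1/7776 figure can be checked by brute force. The sketch below assumes five independent fair six-sided dice and enumerates all 6^5 = 7776 equally likely outcomes; the helper name `pmf_of` is made up for this illustration.

```python
from fractions import Fraction
from itertools import product

FACES = range(1, 7)
outcomes = list(product(FACES, repeat=5))  # all 7776 equally likely rolls


def pmf_of(statistic):
    """Exact pmf of statistic(rolls) when every outcome is equally likely."""
    pmf = {}
    for rolls in outcomes:
        value = statistic(rolls)
        pmf[value] = pmf.get(value, 0) + Fraction(1, len(outcomes))
    return pmf


pmf_first_roll = pmf_of(lambda rolls: rolls[0])  # distribution of X1
pmf_minimum = pmf_of(min)                        # distribution of X(1)

print(pmf_first_roll[6])  # 1/6
print(pmf_minimum[6])     # 1/7776
print(pmf_minimum[1])     # 4651/7776
```

X1 is uniform on 1 to 6, while X(1) puts most of its mass on small values, so the two distributions differ even though X(1) is always equal to one of the Xi.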
 
  • #8
ModernLogic said:
1) If you roll a die five times, can you arrange the die rolls in nondecreasing order?

Yes. Think of 5 positions. In the 1st position you put the lowest of the observed values, in the 2nd position the next higher, and so on up to the 5th position, where you put the highest.
Then X(i) = the value in the i-th position. Note: X(i) is just a variable name, whereas x(i) is the observed value of that named variable.
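
The point that X(i) is a rule fixed before the experiment, and x(i) is what the rule returns once the rolls are known, can be sketched in code. The function name `order_statistics` and the seed are arbitrary choices for this illustration, assuming five fair die rolls per trial.

```python
import random


def order_statistics(sample):
    """Return (x(1), ..., x(n)): the observed values in nondecreasing order."""
    return tuple(sorted(sample))


random.seed(0)  # arbitrary seed so that the run is repeatable
for trial in range(3):
    rolls = [random.randint(1, 6) for _ in range(5)]  # observed x1, ..., x5
    print(rolls, "->", order_statistics(rolls))
```

The same function is applied in every trial; only its input changes. That is the sense in which each X(i) is itself a random variable.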
 
  • #9
Let's consider the following example:
Roll two dice: each of the 36 possible pairs (X1,X2) has the same probability. In particular, for any i in {1,2,3,4,5,6}, the probability that X2=i is 1/6. But for the "ordered version" X(1)<=X(2), the probability that X(2)=1 is 1/36 (only (1,1) will do), and the probability that X(2)=2 is ? (only the pairs (1,2), (2,2), and maybe (2,1) as outcomes of (X1,X2) will do)

I can see that (1,2),(2,2) should be included in the probability, but is the pair (2,1) part of it, too?

If you can help me with this, then I think I will have understood the idea.

Thanks!
 
  • #10
(X1, X2) = (2, 1) will do it too, since in that case (X(1), X(2)) = (1, 2). Thus, your example shows that P(X(2) = 2) = 3/36.
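
The count of 3 out of 36 can be confirmed by listing the pairs. This is a small sketch assuming two independent fair dice; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

pairs = list(product(range(1, 7), repeat=2))  # the 36 ordered pairs (X1, X2)

# Pairs whose larger entry is 2, i.e. the event X(2) = 2.
favourable = [pair for pair in pairs if max(pair) == 2]
print(favourable)                             # [(1, 2), (2, 1), (2, 2)]
print(Fraction(len(favourable), len(pairs)))  # 1/12, i.e. 3/36

# Full distribution of the larger roll X(2).
pmf_max = {
    k: Fraction(sum(1 for pair in pairs if max(pair) == k), len(pairs))
    for k in range(1, 7)
}
```

In this setting the number of pairs with maximum k works out to 2k - 1, so the larger roll is far from uniform even though each individual die is.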
 
  • #11
So "sometimes" X(2)=max(X1,X2) is equal to X1 and "sometimes" X(2)=max(X1,X2) is equal to X2, even if we are talking about the same problem?


Also, for X(2)=max(X1,X2), can the max(...) be treated as a "function"? (like X(2)=f(X1,X2)?)
 
  • #12
Say, for example, if X1 takes on possible values 1,2,3,4,5,6, and X2 takes on possible values 4,5,6,7
Then how can I "order" the random variables X1 and X2?
max(X1,X2)=?
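
The thread does not say how X1 and X2 are distributed, so the following is only a sketch under an added assumption: X1 and X2 are independent and each is uniform on its listed values. Under that assumption, min and max are ordinary functions applied to each outcome pair, and their distributions can be tabulated directly.

```python
from fractions import Fraction
from itertools import product

x1_values = [1, 2, 3, 4, 5, 6]
x2_values = [4, 5, 6, 7]

# Assumption (not from the thread): independent and uniform on the values above,
# so each of the 24 pairs has probability 1/24.
weight = Fraction(1, len(x1_values) * len(x2_values))

pmf_min, pmf_max = {}, {}
for x1, x2 in product(x1_values, x2_values):
    smaller, larger = min(x1, x2), max(x1, x2)  # observed x(1) and x(2)
    pmf_min[smaller] = pmf_min.get(smaller, 0) + weight
    pmf_max[larger] = pmf_max.get(larger, 0) + weight

for value in sorted(pmf_max):
    print("P(max =", value, ") =", pmf_max[value])
```

The two variables do not need the same set of possible values: for every outcome pair one number is at least as large as the other, and that number is the value of max(X1,X2).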
 

FAQ: Order Statistics Definition: Explained

What is the definition of order statistics?

The order statistics of a sample X1,...,Xn are the same values arranged in nondecreasing order, X(1)<=X(2)<=...<=X(n), so that X(i) is the i-th smallest value. The minimum X(1), the maximum X(n), the sample median, and the quartiles are all order statistics or simple functions of them.

How is order statistics different from regular statistics?

A statistic in general is any function of the sample, such as the sample mean. Order statistics are the particular statistics obtained by sorting the sample: X(i) is the i-th smallest value. They are especially useful for describing the location, spread, and extremes of a dataset through quantities such as the median and the range.

How is order statistics used in real-life scenarios?

Order statistics are commonly used in fields such as economics, finance, and engineering. Examples include picking out the best and worst performing assets in a portfolio (the maximum and minimum), summarizing data by medians and other quantiles, and modeling extremes such as the largest load a structure experiences.

What is the significance of order statistics in probability theory?

In probability theory, order statistics play a central role in describing a random sample. Quantities such as the sample minimum, maximum, range, median, and quantiles are defined through them, and many nonparametric methods are built on them. Measures such as the sample mean and variance, by contrast, do not require sorting the data.

Can order statistics be applied to non-numerical data?

Only if the data can be ordered. Ordinal data, such as rankings or graded categories, can be sorted, so order statistics such as the median still make sense. Purely nominal categories with no natural order cannot be arranged this way.
