Meaning of PDFs in the context of statistics

Cole A. · Jan 31, 2013

This is a really basic point that I am getting held up on.

In probability theory, we use PDFs and PMFs to describe random variables. (The term "random variable" carries a certain connotation of "unpredictable variations when repeated.") For example, the number of heads we will see in 50 flips of a coin is a random variable. This number will change from one set of 50 flips to the next. And the changes cannot be predicted beforehand. The random variable could be described by a Bin(50, p) PMF. We could even set p = 0.5 if the coin was fair.
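As a concrete illustration of the coin example, the Bin(50, p) PMF can be tabulated directly. This is a minimal sketch using only the Python standard library; the function name `binomial_pmf` is made up for illustration, and p = 0.5 assumes the fair coin mentioned above.

```python
from math import comb

def binomial_pmf(k, n=50, p=0.5):
    """P(exactly k heads in n independent flips with head probability p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Tabulate the PMF over every possible head count 0..50.
pmf = [binomial_pmf(k) for k in range(51)]

total = sum(pmf)                                    # probabilities sum to 1
most_likely = max(range(51), key=lambda k: pmf[k])  # the mode of the PMF

print(f"sum of PMF = {total:.6f}")
print(f"most likely head count = {most_likely}, P = {pmf[most_likely]:.4f}")
```

The PMF does not say what the next set of 50 flips will give; it says how often each head count turns up over many such sets.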

But in statistics, PDFs and PMFs also seem to be used to describe any quantity that we do not know for sure, even when there is no scope for repetition associated with that quantity, i.e. the quantity has a real, fixed, unchanging value that is just presently unknown to us. This is what is confusing me: If someone looks at a building and says that its height in feet is described by N(100, 50), and another claims that its height is described by Unif(0, 200), what are they saying exactly? The building's height is an absolute, fixed, unchanging number. What meaning does a PDF possibly have in this context?

Simon Bridge · Jan 31, 2013

We use the probability functions because, even though the result of a particular event is not predictable, the pattern of the unpredictability is, itself, predictable. You should have seen this already. It is very useful, for example, to realize that the sum of two dice is most likely to be 7, and cannot be more than 12 or less than 2. There are fortunes being made by being able to predict just how often each possible number will show up in many rolls.
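The two-dice claim can be checked by enumerating all 36 equally likely outcomes. This is a small sketch assuming two fair six-sided dice; the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely (die1, die2) pairs give each sum.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# Exact PMF of the sum, as fractions of 36.
dice_pmf = {total: Fraction(n, 36) for total, n in sorted(counts.items())}

most_likely_sum = max(dice_pmf, key=dice_pmf.get)

for total, prob in dice_pmf.items():
    print(total, prob)
print("most likely sum:", most_likely_sum)
```

Any single roll is unpredictable, but the table of frequencies is fixed in advance, which is the "pattern of the unpredictability" referred to above.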

In the case of a physical object's height, it is an open question whether it has one definite, absolute height at a particular time, as there is no way of telling.

All we can do is measure it to finite precision with our equipment. When someone reports a PDF for a distribution of possible heights, they are telling you that the outcome of many repeated measurements of the height will return values that conform with that distribution.

If someone says "I will build you a building with a height distributed as follows..." then they are telling you how well they can build to the specification. This is why we have margins for error and error analysis.
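The repeated-measurement reading of a PDF can be simulated. The sketch below assumes, purely for illustration, a fixed height of 100 ft and an instrument whose error is normally distributed with a standard deviation of 0.5 ft; neither number comes from the thread.

```python
import random
import statistics

TRUE_HEIGHT_FT = 100.0  # hypothetical fixed height, unknown to the observer
ERROR_SD_FT = 0.5       # hypothetical standard deviation of the instrument error

def measure(rng):
    """One noisy reading: the fixed height plus a random measurement error."""
    return rng.gauss(TRUE_HEIGHT_FT, ERROR_SD_FT)

rng = random.Random(0)  # seeded so the run is reproducible
readings = [measure(rng) for _ in range(10_000)]

sample_mean = statistics.mean(readings)
sample_sd = statistics.stdev(readings)

print(f"mean of readings = {sample_mean:.3f} ft")
print(f"sd of readings   = {sample_sd:.3f} ft")
```

The height itself never changes in this simulation; only the readings vary, and it is the readings that the reported distribution describes.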

Cole A. · Jan 31, 2013

Simon Bridge said:

When someone reports a PDF for a distribution of possible heights, they are telling you that the outcome of many repeated measurements of the height will return values that conform with that distribution.

This clarified things immensely for me. Thank you.

Stephen Tashi · Jan 31, 2013

Cole A. said:

If someone looks at a building and says that its height in feet is described by N(100, 50), and another claims that its height is described by Unif(0, 200), what are they saying exactly? The building's height is an absolute, fixed, unchanging number. What meaning does a PDF possibly have in this context?

Unless you are studying Bayesian statistics, you don't find such statements in a statistics text.
In "frequentist" statistics, the kind normally studied in introductory courses, you would not find the height of one building described by a probability distribution. You might find the height of a randomly selected building from a population of buildings described by a distribution. You might find the measured height of a single building described by a probability distribution if the measurement has a random error.

If you are studying Bayesian statistics, you might use a probability distribution for the height of one building. The distribution can be regarded as stating a "belief" about the height, or you can pretend that when the building was built, its height was selected at random from a population of possible heights that might have occurred.
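To make the "belief" reading concrete, here is a rough sketch of how such a distribution would be revised after one measurement. It uses the standard normal-prior, normal-error update and assumes that N(100, 50) means a standard deviation of 50 ft; the reading of 120 ft and the 5 ft error are invented for illustration.

```python
from math import sqrt

# Prior belief about the building's height (assumed: mean 100 ft, sd 50 ft).
prior_mean, prior_sd = 100.0, 50.0

# One hypothetical measurement with a known, normally distributed error.
reading, error_sd = 120.0, 5.0

# Conjugate normal-normal update: precisions (1 / variance) add, and the
# posterior mean is the precision-weighted average of prior mean and reading.
prior_precision = 1.0 / prior_sd**2
data_precision = 1.0 / error_sd**2
post_precision = prior_precision + data_precision

post_mean = (prior_mean * prior_precision + reading * data_precision) / post_precision
post_sd = sqrt(1.0 / post_precision)

print(f"posterior mean = {post_mean:.2f} ft, posterior sd = {post_sd:.2f} ft")
```

The building's height is the same before and after the measurement; what changes is the distribution describing what is believed about it.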

Simon Bridge · Feb 1, 2013

Hmmm... and N(100,50) feet seems a little odd. Read as a normal distribution with a mean of 100 ft and a standard deviation of 50 ft (a variance of 2500 ft²), it gives a probability of roughly 2% that the height is negative, i.e. that the "building" is actually a basement. (Quite apart from that, the tails of a normal distribution never reach zero, so there's a faint chance that the building is actually a space elevator or a tunnel through the Earth.)
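How large the "basement" probability is depends on how the second parameter of N(100, 50) is read. The sketch below computes P(height < 0) under both readings (standard deviation 50 ft, or variance 50 ft²) from the normal CDF written in terms of `math.erfc`; the helper name `normal_cdf` is illustrative.

```python
from math import erfc, sqrt

def normal_cdf(x, mean, sd):
    """P(X <= x) for X ~ Normal(mean, sd); erfc keeps accuracy far in the tail."""
    return 0.5 * erfc(-(x - mean) / (sd * sqrt(2.0)))

# Reading 1: the second parameter is the standard deviation (50 ft).
p_neg_sd50 = normal_cdf(0.0, 100.0, 50.0)

# Reading 2: the second parameter is the variance (50 ft^2, sd about 7.07 ft).
p_neg_var50 = normal_cdf(0.0, 100.0, sqrt(50.0))

print(f"P(height < 0), sd = 50 ft:        {p_neg_sd50:.4f}")
print(f"P(height < 0), variance = 50 ft2: {p_neg_var50:.3e}")
```

Under the first reading, zero is only two standard deviations below the mean, so a negative height is not negligible; under the second it is about fourteen standard deviations away and can be ignored in practice.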
