Sample mean and linear relation?

In summary, the book explains that computing the sample mean can be simplified by choosing constants a and b so that the transformed values [tex]y_{i}=ax_{i}+b[/tex] are small and easy to average. In the given example, a = 1 and b = -280. The mean of the transformed data set is [tex]\bar{y}=a\bar{x}+b[/tex], so the original mean is recovered by solving for [tex]\bar{x}[/tex] using the known values of [tex]\bar{y}[/tex], a, and b.
  • #1
jwxie

Homework Statement



The book states the following:

The sample mean is defined by: [tex]\bar{x}=\sum_{i=1}^{n}x_{i}/n[/tex]

The computation of the sample mean can often be simplified by noting that if, for constants a and b, [tex]y_{i}=ax_{i}+b[/tex], then the sample mean of the data set [tex]y_{1},\ldots,y_{n}[/tex] is: [tex]\bar{y}=\sum_{i=1}^{n}(ax_{i}+b)/n=\sum_{i=1}^{n}ax_{i}/n+\sum_{i=1}^{n}b/n=a\bar{x}+b[/tex]
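
The identity above can be checked numerically. The following is a minimal sketch and is not from the book: the data values and the constants a = 3, b = -7 are arbitrary choices made for illustration, and `sample_mean` is a hypothetical helper name.

```python
from fractions import Fraction

def sample_mean(values):
    """Return the exact sample mean: sum of the values divided by their count."""
    return Fraction(sum(values), len(values))

# Illustrative data and constants (assumptions, not taken from the book).
x = [2, 5, 11, 6]
a, b = 3, -7

# Transform every observation: y_i = a * x_i + b.
y = [a * xi + b for xi in x]

x_bar = sample_mean(x)
y_bar = sample_mean(y)

# The mean of the transformed data equals the transformed mean.
assert y_bar == a * x_bar + b
print(x_bar, y_bar)
```

Note that a here is a scale factor applied to each data value; the division by n is already part of the definition of the mean and is separate from a.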


Given the question:


The winning scores in the U.S. Masters golf tournament in the years from
1982 to 1991 were as follows: 284, 280, 277, 282, 279, 285, 281, 283, 278, 277

The book computes as follows:

Rather than directly adding these values, it is easier to first subtract 280 from each one to obtain the new values [tex]y_{i}=x_{i}-280[/tex]:

4, 0, −3, 2, −1, 5, 1, 3, −2, −3

Because the arithmetic average of the transformed data set is

[tex]\bar{y}=6/10[/tex]

it follows that

[tex]\bar{x}=\bar{y}+280=280.6[/tex]

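
The book's computation can be reproduced in a few lines. This is only a sketch for checking the arithmetic; the variable names are illustrative, and `Fraction` is used so that the comparison with the direct average is exact rather than subject to floating-point rounding.

```python
from fractions import Fraction

# Winning scores quoted in the problem statement (1982 to 1991).
scores = [284, 280, 277, 282, 279, 285, 281, 283, 278, 277]

# Shift each score by 280, i.e. y_i = x_i - 280 (a = 1, b = -280).
shifted = [s - 280 for s in scores]

# Mean of the small shifted values, kept exact with Fraction.
y_bar = Fraction(sum(shifted), len(shifted))

# Undo the shift: x_bar = y_bar + 280.
x_bar = y_bar + 280

# Direct computation on the original scores, for comparison.
direct = Fraction(sum(scores), len(scores))

print(shifted, float(y_bar), float(x_bar), float(direct))
```
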
(1) Why are we choosing 280? What is the reason for that? I tried other numbers, but they don't give the same sample mean.

(2) I also want to confirm my understanding of the linear relation given the summation. I know the constant a should be 1/n, as x1/n + x2/n + ... + xn/n is the same as (x1 + x2 + ... + xn)/n. Why do we need b, the intercept? Is it just stating the obvious, a general form? Or is it possible that we will encounter a sample mean that crosses the y-axis?

But what I just said doesn't make sense for "yi = xi − 280", where the constant a is 1. Can someone correct me? Thanks!
 
  • #2
Let's choose 20 instead of 280. Then the new data set becomes

264 260 257 262 259 265 261 263 258 257

Hence [tex]\bar{y}=260.6[/tex], and it follows that [tex]\bar{x}=20+260.6=280.6[/tex], the same sample mean the book obtains with 280.

y = ax + b is the general form of the linear relation between x and y.
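
The same check can be run for several choices of the subtracted constant. This is a rough sketch: `mean_via_shift` is a hypothetical helper, the constants 20 and 280 come from this thread, and 0 and 300 are arbitrary extras added for comparison.

```python
from fractions import Fraction

scores = [284, 280, 277, 282, 279, 285, 281, 283, 278, 277]

def mean_via_shift(values, c):
    """Subtract c from every value, average the results, then add c back."""
    shifted = [v - c for v in values]
    return Fraction(sum(shifted), len(shifted)) + c

# Candidate shifts; 20 and 280 come from the thread, the others are arbitrary.
results = {c: mean_via_shift(scores, c) for c in (0, 20, 280, 300)}

for c, m in results.items():
    print(c, float(m))
```

Every choice recovers the same sample mean. The shift only changes how large the intermediate numbers are, which is why a value close to the data, such as 280, is convenient for hand calculation.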
 
  • #3
The book states that if the mean is [tex]\bar{x}=\sum_{i=1}^{n}x_{i}/n[/tex],

then for another data set related to it by [tex]y_{i}=ax_{i}+b[/tex]

the mean is [tex]\bar{y}=a\bar{x}+b[/tex].

What they are trying to imply here is that by properly choosing a and b, the process of finding the mean can be simplified.

In the example solution you mentioned,

a = 1, b = -280 (the values can be any real numbers with a nonzero, but they are chosen to make the problem simpler),

so the equation becomes [tex]y_{i}=x_{i}-280[/tex].

This effectively converts the problem into a much simpler one involving small numbers. Once we compute [tex]\bar{y}[/tex], we have to get back to [tex]\bar{x}[/tex]. This is done by solving [tex]\bar{y}=a\bar{x}+b[/tex] for [tex]\bar{x}[/tex] (since [tex]\bar{y}[/tex], a, and b are known).

This gives the answer.
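
The last step, solving for the original mean, can be written out as a small function. This is a sketch under stated assumptions: `recover_original_mean` is a hypothetical name, a must be nonzero for the inversion to work, and the second call uses a made-up rescaling (a = 10, b = -2800) that does not appear in the book.

```python
from fractions import Fraction

def recover_original_mean(y_bar, a, b):
    """Solve y_bar = a * x_bar + b for x_bar; requires a != 0."""
    if a == 0:
        raise ValueError("a must be nonzero to invert the transformation")
    return (Fraction(y_bar) - b) / a

# The thread's example: a = 1, b = -280, y_bar = 6/10.
x_bar = recover_original_mean(Fraction(6, 10), 1, -280)

# A hypothetical rescaling: y_i = 10 * x_i - 2800 has mean y_bar = 6.
x_bar_scaled = recover_original_mean(6, 10, -2800)

print(float(x_bar), float(x_bar_scaled))
```
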
 

FAQ: Sample mean and linear relation?

What is the sample mean?

The sample mean is a measure of central tendency that represents the average value of a set of data. It is calculated by adding all the values in the sample and dividing by the number of values.

How is the sample mean different from the population mean?

The sample mean is calculated using a subset of a larger population, while the population mean represents the average value of the entire population. The sample mean is often used to estimate the population mean.

What is a linear relation?

A linear relation is a relationship between two variables that can be represented by a straight line on a graph. In this type of relation, a change in one variable is directly proportional to a change in the other variable.

How is the sample mean related to a linear relation?

The sample mean does not by itself measure whether two variables are linearly related. In this thread, the linear relation is a transformation of the data: if [tex]y_{i}=ax_{i}+b[/tex], then [tex]\bar{y}=a\bar{x}+b[/tex], so the mean of the transformed data determines the mean of the original data. Judging whether two variables are linearly related is usually done with other tools, such as a line of best fit or the correlation coefficient.

Why is the sample mean important in scientific research?

The sample mean is important in scientific research because it helps to summarize and describe a large set of data. It can also be used to make predictions and draw conclusions about a population. Additionally, the sample mean is often used to test hypotheses and evaluate the effectiveness of treatments or interventions.
