Calculating Averages: Not so simple after all?

  • Thread starter Astro
In summary: When I average the averages of unequal-sized subsets of the values 1, 2, 3, 4, 6, 1, 2, and 3, I get 3, but when I average all eight values directly, (1+2+3+4+6+1+2+3)/8, I get 2.75. Why is this so? I get a different average each time I divide these values into different sets. While 2.75 is the correct answer, is there a correction factor that can be applied to these other averages to arrive at the correct average?
  • #1
Astro
Maybe you math guys & gals will have more luck than the physics people. Basically, I calculated an average two different ways and got two different answers. The problem is that both answers should be the same, and I don't understand why they are not. My question is "why are they not the same?" or "why is one wrong?" To see my data and work, please see the attachment. In particular, the problem is: why does the method #1 answer not equal the method #2 answer?

Thanks.
 

Attachments

  • Attachment.pdf
    18.1 KB · Views: 294
  • #2
Neither of them looks right. First off, the average should be:

(total # of class hours) / (total # of classes)

right? Secondly, I don't see how the next step follows in either of your approaches.
 
  • #3
Some weeks are busier than others. That means more classes and more hours. The least busy week is what I have termed the 'low average'. The busiest a week can ever be is what I have termed the 'high average'. On average, a week is busier than the least busy week and less busy than the busiest week.

I need all three types of averages for my project. If I only wanted to find the overall average, I would approach it like you suggested. In fact, that's exactly what I did in method #1. The only difference is that I am more specific in my terminology. Instead of saying average = (total # of class hours) / (total # of classes), I said average = (total # average of class hours) / (total # average of classes). That extra word is important because without it, it's not clear whether you're trying to calculate the low, high, or overall average. If you take a careful look at my table, you will see that I have indicated that some classes do not run every week. If every class did run every week then, yes, I would have answered the question exactly as you suggested.

Since I had calculated the 'low average' and 'high average' already, I thought it might be easier to average the two to find the overall average instead of doing the whole calculation from scratch. (low+high)/2 is what I did in method #2.

What I can't explain is why method #1 and method #2 give different answers. That's what I need help with. It became obvious that the two methods are not mathematically equivalent but I really don't understand why not. To me, it seems that they should be.

Anyway, thanks for trying. You're the only one who has posted a reply so far, and I appreciate it.

~ Astro ~
 
  • #4
Astro said:
I thought it might be easier to average the two to find the overall average instead of doing the whole calculation from scratch. (low+high)/2 is what I did in method #2.

What I can't explain is why method #1 and method #2 give different answers.

Can you explain why you think they should be the same?


There's a classic (pseudo)paradox: while driving to my friend's house, I averaged 45 miles per hour. While driving home, I averaged 55 miles per hour. What was my average speed for the round trip?

Average speed is, of course, (total distance) / (total time). If the distance between our houses is X, then:

Total distance = 2X
Total time = (time there) + (time back) = X/45 + X/55 = 100X/2475 = 4X/99.
Average speed = 2X / (4X/99) = 99/2 = 49.5 MPH.

Of course, if you're not thinking, you might simply take the (unweighted) average of 45 and 55 and get 50... but since this computation is not computing (total distance) / (total time), there's no reason to think you should get the right answer.

(Of course, taking the appropriate weighted average of the two speeds, with each speed weighted by the time spent driving at it, does work.)
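The round-trip arithmetic above can be checked with a short Python sketch. The function names and the one-way distance of 99 miles are assumptions chosen for illustration (any positive distance gives the same result); exact fractions are used so that no rounding creeps in.

```python
from fractions import Fraction

def average_speed(distance_each_way, speed_out, speed_back):
    """Round-trip average speed as (total distance) / (total time)."""
    total_distance = 2 * distance_each_way
    total_time = distance_each_way / speed_out + distance_each_way / speed_back
    return total_distance / total_time

def time_weighted_average(distance_each_way, speed_out, speed_back):
    """Average of the two speeds, each weighted by the time spent at it."""
    time_out = distance_each_way / speed_out
    time_back = distance_each_way / speed_back
    weighted_sum = speed_out * time_out + speed_back * time_back
    return weighted_sum / (time_out + time_back)

# Hypothetical one-way distance; the answer does not depend on it.
x = Fraction(99)

true_average = average_speed(x, 45, 55)              # 99/2
naive_average = Fraction(45 + 55, 2)                 # 50
weighted_average = time_weighted_average(x, 45, 55)  # 99/2

print(float(true_average), float(naive_average), float(weighted_average))
```

The unweighted mean of 45 and 55 overstates the result because more time is spent on the slower leg; weighting each speed by its time reproduces (total distance) / (total time).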
 
  • #5
Thanks Hurkyl!

Yes, I think I understand now. Your example helped quite a bit although I had to think about it for a while.

(See new attachment for more info.)
 

Attachments

  • Eureka__.pdf
    64.9 KB · Views: 256
  • #6
Hi,

I am facing a similar problem. What I am trying to do is average a series of values by dividing them into different sets. For example, let us take 8 values, 1, 2, 3, 4, 6, 1, 2, and 3. When I average ((1+2+3)/3 + (4+6)/2 + (1+2+3)/3)/3, I get 3 as the result. Whereas when I average (1+2+3+4+6+1+2+3)/8, I get 2.75 as the result. Why is this so? I get a different average each time I divide these values into different sets. While 2.75 is the correct answer, is there a way to find a correction factor which can be applied to these other averages to arrive at the correct average?

Regards,
Santosh
 
  • #7
What you're doing is actually very interesting because it is very similar to how statistics works.

You could think of taking the average of all of the numbers as finding the true average of that population of numbers.

Taking the average of all small subsets is similar to statistical sampling. If you kept taking the average of small sets (for example (1+2+3)/3 = 2) and repeated that process with different (randomly chosen) sets, you would expect the average of those averages to approach the population average of 2.75.

You don't get that because you haven't taken enough samples.
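As a rough check of the sampling idea, the sketch below replaces repeated random sampling with its limiting case: it enumerates every possible sample of three of the eight values and averages all of the sample averages. That substitution, the sample size of 3, and the helper name `mean` are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import combinations

values = [1, 2, 3, 4, 6, 1, 2, 3]

def mean(xs):
    """Exact arithmetic mean of an iterable of numbers."""
    xs = list(xs)
    return Fraction(sum(xs), len(xs))

population_mean = mean(values)  # 11/4, i.e. 2.75

# Every way of choosing 3 of the 8 positions (56 samples in total).
sample_means = [
    mean(values[i] for i in positions)
    for positions in combinations(range(len(values)), 3)
]

mean_of_sample_means = mean(sample_means)

print(len(sample_means), float(mean_of_sample_means), float(population_mean))
```

Because every value appears in the same number of equally sized samples, the average of the sample averages equals the population average exactly.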

Another thing is the samples you take aren't of the same size, so things get weighted unfairly.
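One way to see the unfair weighting, and a possible "correction factor", is to weight each subset average by the number of values in that subset. The sketch below uses the three subsets from the question; the variable names are illustrative assumptions.

```python
from fractions import Fraction

# The three subsets from the question: sizes 3, 2, and 3.
groups = [[1, 2, 3], [4, 6], [1, 2, 3]]

group_means = [Fraction(sum(g), len(g)) for g in groups]  # 2, 5, 2

# Unweighted average of the subset averages: each subset counts equally,
# so the two values in the middle subset are over-represented.
unweighted = sum(group_means) / len(group_means)

# Weighted average: each subset average counts in proportion to its size.
total_count = sum(len(g) for g in groups)
weighted = sum(m * len(g) for m, g in zip(group_means, groups)) / total_count

print(float(unweighted), float(weighted))
```

Multiplying each subset average by its size simply recovers that subset's sum, so the weighted version is the same computation as adding all eight values and dividing by 8.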

But without thinking about that too much, try doing this (sorry, I don't have a picture):

Draw a number line at the top of a page from 0 to 3.

Now we know that the overall average will be 2 (by adding 1+2+3 and dividing by 3), so draw a line directly down from the 2. You can see that the line is directly between 1 and 3, and you can draw an inverted triangle leading to these numbers. From this you can visually see that when you take an unweighted average of numbers that increment by the same amount, the result is exactly in the middle of the numbers.

Now draw an inverted pyramid for the average of 1 and 2. Again it's directly between the numbers. Now draw a pyramid showing the average of that result, (1+2)/2, and 3. The line you draw is directly between the average of 1 and 2 and the number 3. See how things get skewed higher?

This is visually what is occurring when you take smaller sets and average their averages. What happens to the picture when the smaller sets all have the same size?
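The pyramid picture can also be worked through numerically. The sketch below follows the 1, 2, 3 example and then, as one possible answer to the closing question, splits the eight values from the earlier post into four pairs; that particular pairing is an assumption chosen for illustration.

```python
# Direct average of 1, 2, 3.
direct = (1 + 2 + 3) / 3             # 2.0

# Nested average: first average 1 and 2, then average that result with 3.
pair_average = (1 + 2) / 2           # 1.5
nested = (pair_average + 3) / 2      # 2.25, skewed toward the lone 3

# Equal-sized sets: split the eight values into four pairs.
values = [1, 2, 3, 4, 6, 1, 2, 3]
pairs = [values[i:i + 2] for i in range(0, len(values), 2)]
pair_means = [sum(p) / len(p) for p in pairs]            # 1.5, 3.5, 3.5, 2.5
mean_of_pair_means = sum(pair_means) / len(pair_means)   # 2.75
overall_mean = sum(values) / len(values)                 # 2.75

print(direct, nested, mean_of_pair_means, overall_mean)
```

When every set has the same size, each value carries the same weight and the average of the averages matches the overall average.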

I hope that makes sense (wish I had a pic!).

Edit: you could draw the line up and then the pyramids won't be inverted. That's just not the way I drew it on my whiteboard. :P
 

FAQ: Calculating Averages: Not so simple after all?

What is the formula for calculating an average?

The formula for calculating an average is to add up all the numbers in a set of data and then divide the sum by the total number of values in the set.

Can you explain the difference between mean, median, and mode?

Mean is the average of a set of data, calculated by adding up all the numbers and dividing by the total number of values. Median is the middle value in a set of data when the values are arranged in numerical order (or the mean of the two middle values when the count is even). Mode is the most frequently occurring value in a set of data; a set can have more than one mode.
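As a small illustration, Python's standard `statistics` module can compute all three for the eight values used earlier in the thread. This sketch assumes Python 3.8 or later, where `statistics.multimode` is available.

```python
import statistics

values = [1, 2, 3, 4, 6, 1, 2, 3]

mean_value = statistics.mean(values)      # sum of values / count of values
median_value = statistics.median(values)  # sorted: 1 1 2 2 3 3 4 6, middle pair is 2 and 3
modes = statistics.multimode(values)      # all values tied for most frequent

print(mean_value, median_value, modes)
```

Here 1, 2, and 3 each occur twice, so the data set has three modes rather than one.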

How do outliers affect the calculation of an average?

Outliers, which are extreme values that are significantly different from the rest of the data, can greatly affect the calculation of an average. They can skew the results and make the average less representative of the data as a whole.
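The effect can be seen with a made-up data set (the numbers below are hypothetical, not taken from the thread): adding a single extreme value moves the mean a long way while the median barely changes.

```python
import statistics

# Hypothetical data, chosen only to illustrate the effect of one outlier.
data = [2, 3, 3, 4, 5]
data_with_outlier = data + [100]

mean_before = statistics.mean(data)                     # 17 / 5
mean_after = statistics.mean(data_with_outlier)         # 117 / 6
median_before = statistics.median(data)                 # middle value
median_after = statistics.median(data_with_outlier)     # mean of the two middle values

print(mean_before, mean_after)
print(median_before, median_after)
```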

What are the limitations of using averages in data analysis?

Averages do not provide a complete picture of the data and can be misleading if there are outliers or if the data is not normally distributed. They also do not account for the variability or range of the data.

Are there any alternative methods for calculating averages?

Yes, there are alternative methods for calculating averages, such as weighted averages, geometric means, and trimmed means. These methods may be more appropriate in certain situations where the data is not normally distributed or there are outliers present.
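A minimal sketch of the three alternatives follows. The helpers `weighted_mean` and `trimmed_mean` are hypothetical names written for this example, the input numbers are illustrative, and `statistics.geometric_mean` assumes Python 3.8 or later.

```python
import math
import statistics

def weighted_mean(values, weights):
    """Sum of value * weight, divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def trimmed_mean(values, k):
    """Mean after dropping the k smallest and k largest values.

    Assumes 2 * k is smaller than the number of values.
    """
    ordered = sorted(values)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

# Weighted mean: subset averages 2, 5, 2 weighted by subset sizes 3, 2, 3.
weighted = weighted_mean([2, 5, 2], [3, 2, 3])

# Geometric mean: the n-th root of the product of n positive values.
geometric = statistics.geometric_mean([1, 4, 16])

# Trimmed mean: drop one value from each end before averaging.
trimmed = trimmed_mean([2, 3, 3, 4, 5, 100], 1)

print(weighted, geometric, trimmed)
```

The weighted mean here reproduces the 2.75 from the thread, and the trimmed mean discards the hypothetical outlier of 100 before averaging.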
