Comparing Z Scores on Two Different Tests

  • #1
Agent Smith
TL;DR Summary
Using z scores to compare scores
Capture.PNG


I believe the z-score tells us Cindy did better in her test than Bobby.

I also computed the following:
Capture.PNG

Boys, on average, score better than girls.
 
  • #2
@berkeman thank you for the edit. The question is clearer now.
 
  • #3
It doesn't say whether the two are taking different tests. I say yes. It appears you say no.

That computation you did I don't understand. I can't tell what question it is meant to answer.
 
  • #4
I don't think they were taking the same test. The z-scores are meant to answer the query "who did better?". The other part of the "answer" is the difference in the distributions (Bobby's class and Cindy's class).
 
  • #5
The first part makes sense.
The second part (your calculation of the statistical parameters of X-Y) does not. You cannot talk about X-Y without specifying how the X and Y values are paired. The formula for ##\sigma_{X-Y}## depends on the covariance of the two: ##\operatorname{var}(X-Y)=\operatorname{var}(X)+\operatorname{var}(Y)-2\operatorname{cov}(X,Y)##. That means that the sets of X and Y values must have an equal number of elements and you need to know how they are paired up.

UPDATE: You can specify that you determine an X value and randomly select a Y value (with replacement). Then you can say the X and Y values are uncorrelated and you do not need to have equal numbers taking the two tests.
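
A minimal sketch of why the pairing matters, using made-up scores (the means, standard deviations and class size below are illustrative assumptions, not the values from the attachment). It pairs the same two sets of scores in two ways, sorted-with-sorted and at random with replacement, and compares the variance of the differences against var(X) + var(Y):

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical scores: none of these parameters come from the original problem.
n = 1000
x = [random.gauss(80, 5) for _ in range(n)]   # scores on test X
y = [random.gauss(75, 10) for _ in range(n)]  # scores on test Y


def covariance(a, b):
    """Population covariance of two equally long, paired lists."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / len(a)


def var_of_difference(a, b):
    """Population variance of the paired differences a[i] - b[i]."""
    return statistics.pvariance([ai - bi for ai, bi in zip(a, b)])


var_sum = statistics.pvariance(x) + statistics.pvariance(y)

# Pairing 1: lowest X with lowest Y, and so on (strongly correlated).
xs, ys = sorted(x), sorted(y)
var_sorted = var_of_difference(xs, ys)

# Pairing 2: each X with a randomly chosen Y, with replacement.
y_random = random.choices(y, k=n)
var_random = var_of_difference(x, y_random)

print(f"var(X) + var(Y)          = {var_sum:.1f}")
print(f"var(X - Y), sorted pairs = {var_sorted:.1f}")
print(f"var(X - Y), random pairs = {var_random:.1f}")
```

The same scores give a much smaller variance of differences under the sorted pairing than under the random one, because the covariance term is large in the first case and close to zero in the second. The mean of the differences is not affected by the pairing in the sorted case, since both lists keep all their values.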
 
  • #6
I see.

Capture.PNG


So this is meaningless. 🤔
 
  • #7
Agent Smith said:
I see.

View attachment 353364

So this is meaningless. 🤔
I'm afraid so. I would interpret 0 as meaning there is no difference, that was a clue.
 
  • #8
69.15% of the girls scored higher than Cindy and it is, as @Agent Smith said in the original post, more than 50%.
 
  • #9
@Hornbein & @FactChecker thanks for responding. My stats course had a chapter on combining random variables and I drew from that to do the computation, which both of you find not to make sense.

I'm sure you've encountered these in your statistical careers: ##\mu_{X - Y}## where X and Y are random variables. Do you have a link that explains the idea in a better way? Does the test have to be the same for the 2 groups (Bobby's and Cindy's classes)?

@Gavran gracias for doing the computation and finding the exact percentage. Did you take a look at the second part of my question, the ##\mu_{X - Y}##? Since ##\mu_{X - Y} > 0##, the boys, "on average", did better than the girls. Does the test have to be the same for the 2 groups (Bobby's and Cindy's classes)?

Does the z-score result I calculated in the OP apply even if the tests were different? Maybe they have to be the same level/grade and the same subject, or not?
 
  • #10
Agent Smith said:
@Hornbein & @FactChecker thanks for responding. My stats course had a chapter on combining random variables and I drew from that to do the computation, which both of you find not to make sense.

I'm sure you've encountered these in your statistical careers: ##\mu_{X - Y}## where X and Y are random variables. Do you have a link that explains the idea in a better way? Does the test have to be the same for the 2 groups (Bobby's and Cindy's classes)?

No. You can find the difference between means of any two random variables you like. It's a valid operation. In this case the difference is 5. But since they are taking different tests, who cares? This is applied mathematics, so such questions matter.

Since Bobby and Cindy are taking different tests, comparing their raw scores tells you nothing about who performed better. This is why we compare z scores instead. (That's not something I'd do, but this is a test question making a point, not serious stuff.)
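
As a rough illustration of that comparison, here is a small sketch with hypothetical numbers. The raw scores, class means and standard deviations are assumptions chosen for illustration; the actual values are in the attachment and are not reproduced here.

```python
def z_score(raw, mean, sd):
    """Number of standard deviations a raw score lies above its class mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (raw - mean) / sd


# Hypothetical values, NOT the ones from the original problem.
bobby_z = z_score(raw=82, mean=80, sd=5)    # 2 points above the mean, sd 5
cindy_z = z_score(raw=80, mean=75, sd=10)   # 5 points above the mean, sd 10

# Bobby has the higher raw score, but Cindy sits further above her class mean.
print(f"Bobby: z = {bobby_z:.2f}")
print(f"Cindy: z = {cindy_z:.2f}")
print("Higher relative standing:", "Cindy" if cindy_z > bobby_z else "Bobby")
```

With these made-up numbers the raw scores and the z scores rank the two students in opposite orders, which is the situation the question is designed to highlight.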
 
  • #11
Hornbein said:
But since they are taking different tests, who cares?
Hornbein said:
raw scores
Since the scores by themselves don't seem to help us tell who's done better, we use a z score. I just read another thread on z scores and the author says he used it to normalize the data(?). What does that mean?
 
  • #12
Agent Smith said:
@Hornbein & @FactChecker thanks for responding. My stats course had a chapter on combining random variables and I drew from that to do the computation, which both of you find not to make sense.
You cannot just talk vaguely about adding a set of X results to a set of Y results. You have not specified how the values of X and Y are paired. The sum of two random variables means that for one summed value, you get a value for X and a value for Y and you add them. That means those two values are paired. The full equation for ##\sigma_{X-Y}## includes a term for the correlation between paired X and Y values. If they are uncorrelated, your simplified equation is correct. Otherwise, it may not be. Suppose that you start with the X values from lowest to highest and match each one with a Y value, also going from lowest to highest. Then there is a strong correlation and your equation for ##\sigma_{X-Y}## is wrong.
You can force the paired X and Y values to be uncorrelated by specifying that for every X value, you randomly select a Y value, with replacement. ADDED: (Even then, they are only uncorrelated in the theoretical infinite sample case. Your finite example is just a sample estimate of the theoretical value and might easily be correlated, even with a random selection of Y values to pair with X values.)

PS. The equation ##\mu_{X-Y}=\mu_X-\mu_Y## is correct, even if the random variables X and Y are correlated. It is just a probability-weighted integral, which is linear.
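
For reference, both identities follow in a couple of lines from linearity of expectation. This is the standard textbook derivation, written here under the assumption that X and Y have finite variances:

$$\mu_{X-Y} = E[X-Y] = E[X]-E[Y] = \mu_X-\mu_Y$$

$$\sigma_{X-Y}^2 = E\left[\big((X-\mu_X)-(Y-\mu_Y)\big)^2\right] = \sigma_X^2+\sigma_Y^2-2\operatorname{cov}(X,Y)$$

Only the second line involves the pairing, through ##\operatorname{cov}(X,Y)##. It reduces to ##\sigma_X^2+\sigma_Y^2## when X and Y are uncorrelated.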
Agent Smith said:
I'm sure you've encountered these in your statistical careers: ##\mu_{X - Y}## where X and Y are random variables. Do you have a link that explains the idea in a better way? Does the test have to be the same for the 2 groups (Bobby's and Cindy's classes)?
No. The tests do not have to be the same.
Agent Smith said:
@Gavran gracias for doing the computation and finding the exact percentage. Did you take a look at the second part of my question, the ##\mu_{X - Y}##? Since ##\mu_{X - Y} > 0##, the boys, "on average", did better than the girls. Does the test have to be the same for the 2 groups (Bobby's and Cindy's classes)?
No. The tests do not have to be the same.
 
  • #14
🤔 Non liquet
 
  • #15
Agent Smith said:
@Gavran gracias for doing the computation and finding the exact percentage.
The result was not got by computation. It was got by reading from the table. See https://en.wikipedia.org/wiki/Standard_normal_table#Table_examples.

Agent Smith said:
I just read another thread on z scores and the author says he used it to normalize the data(?). What does that mean?
Standardization of a random variable is the process of converting a random variable into a random variable with a standard normal distribution (strictly, one with mean 0 and standard deviation 1; it is standard normal only if the original variable is normally distributed). This is the same as computing z-scores for a random variable. This is why sometimes the standard normal distribution is called the z-score distribution and the standardization is called the z-score normalization.
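
A minimal sketch of what standardizing a data set does in practice. The scores are hypothetical, and using the population standard deviation rather than the sample one is an assumption made for this example:

```python
import statistics


def standardize(values):
    """Return z-scores: subtract the mean, divide by the standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population standard deviation
    if sd == 0:
        raise ValueError("cannot standardize constant data")
    return [(v - mean) / sd for v in values]


# Hypothetical raw scores, for illustration only.
scores = [55, 60, 65, 70, 75, 80, 85]
z = standardize(scores)

print([round(v, 2) for v in z])
print("mean of z:", round(statistics.fmean(z), 10))
print("sd of z:  ", round(statistics.pstdev(z), 10))
```

The standardized values have mean 0 and standard deviation 1 whatever the original units were. The shape of the distribution is unchanged; only its location and scale are.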
 
  • #16
Gavran said:
The result was not got by computation. It was got by reading from the table.
Ok.

Gavran said:
converting a random variable into a random variable with a standard normal distribution
Ok.
How would you define normalization/standardization?

@FactChecker replied NO, the tests do not have to be the same. This isn't clear to me, because that means I can compare performance in one subject to that in another. 🤔
 
  • #17
Capture.png


🤔
 
  • #18
Agent Smith said:
@FactChecker replied NO, the tests do not have to be the same. This isn't clear to me, because that means I can compare performance in one subject to that in another. 🤔
There are a lot of ways to compare performances on two different tests. Some ways are more reasonable and useful than others.
You can compare the relative performance on one test to the relative performance on another test. If I took two tests and got a perfect score on one but failed the other, I could say that I did better on the first test. That would even be true if the first test was an English test and the second one was a math test.
 
  • #19
The point about normalised scores is that they say where you are compared to other people who took the test. Zero is exactly average, +1 means you scored better than 84% of the people who took that test. Comparing Bobby and Cindy's normalised scores therefore tells you who is higher in their class rankings.
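
The percentages attached to z scores can be checked with a short sketch, assuming the test scores are normally distributed (the question does not guarantee this). It uses `statistics.NormalDist` from the Python standard library; `percent_below` is a helper name introduced here for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)


def percent_below(z):
    """Percentage of a normal population scoring below the given z score."""
    return 100.0 * standard_normal.cdf(z)


for z in (0.0, 0.5, 1.0):
    print(f"z = {z:+.1f}: about {percent_below(z):.2f}% of scores fall below")
```

Here z = 0.5 gives the 69.15% entry from the standard normal table mentioned earlier in the thread, and z = 1 gives the roughly 84% figure.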

I would say, though, that there are a lot of other factors that aren't specified in the question that affect whether Bobby "did better" than Cindy. If you give university maths professors an exam, half of them would have negative z scores. If you give year 10s a maths exam half of them would have positive z scores. Are they really better at maths than the lower half of maths professors? The point is that the populations being tested might differ (I've pre-selected good mathematicians in one population), and the tests might be harder or easier (the year 10s couldn't even start most of the questions you could ask maths professors).

That said, you can often use z scores to compare ability as long as you are reasonably certain that the populations aren't crazily different and neither are the tests. If Cindy is in year 9 and Bobby in year 10 they're taking different tests, but they're in the same pipeline which has fairly few "holes" where people leave or enter the population. If Cindy has a better z score then she'll probably score better than Bobby did when she gets to the test he took. Or you might use it on yourself - if your z score in English is better than your z score in maths then you could take that as evidence that you're better at English than maths. But even so, you do have to consider the possibility that you had the misfortune to be in class with Gauss, Euler and Leibniz.
 
  • #20
A little clearer, gracias @Ibix

Can you comment on 👇

Capture.png
 
  • #21
Agent Smith said:
A little clearer, gracias @Ibix

Can you comment on 👇

View attachment 353450
I think you're trying to find the parameters of the distribution of differences in scores. But, as others have commented, which differences? If your two datasets are two tests of the same people (English and maths, or before-and-after some medical treatment) then you can compute the differences for each person and examine the distribution of that statistic. Other relationships are possible, of course - pairs of siblings where the older takes Bobby's test and the younger takes Cindy's, for example.

But the difference between your English score and my maths score isn't really meaningful, and the distribution of that difference isn't well defined - why were we paired up? There'd be different statistics if people were paired in a different way. So you're trying to compute statistics of a distribution that isn't well defined, which just won't work.
 
  • #22
@Ibix it's very confusing. I kinda sorta grok why a z score will assist in finding out whether Bobby or Cindy did better, but I calculated the "[ ... ] differences in scores" and found out that on average boys did better.
 
  • #23
You are using "did better" in two different senses.

Comparing the z scores tells you that Cindy is ranked higher among those who took her test than Bobby is among those who took his. In that sense, Cindy "did better" than Bobby. She did better compared to her peers on the test she took than Bobby did compared to his peers on the test he took.

Comparing the raw means tells you that the people who took the test Bobby did, on average, scored higher than those who took the test Cindy did. However, there is a much larger spread in the scores in Cindy's test, so some of the high performers probably scored better than high performers who took Bobby's test. That much is valid. If you knew the numbers in each class you could compute the variances of the means and form a rough opinion on whether there's a statistically significant difference (formally you'd probably want to do a load more careful work, but the difference in means divided by the standard deviation of the mean is a finger-in-the-wind estimate). In that sense, Bobby's class did better on average than Cindy's class, although individuals will have scored higher or lower.

Where you went wrong was introducing Bobby and Cindy's individual scores into that. You can certainly talk about difference in the means of populations ("on average, my class did better than yours") or you can talk about individual differences ("I did better than you"), but you can't really mix them up like you did. You can talk about a population of differences when the differences are well-defined. For example, consider weight before and after a weight loss treatment. We're interested in the pairs of measurements made on one person, and no other pairing would make sense - my before weight minus your after weight is meaningless.

But when comparing class scores on a test, why would you pair up Bobby and Cindy and not Bobby and Diana? There might not even be equal numbers in the two classes to pair up. Bobby can certainly choose to compare his raw score and/or his z score to Cindy's. But there's no systematic way to pair up each person who took one test to one person who took the other, so there's no defined population of "difference in score" to do statistics on.
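
The finger-in-the-wind estimate for the class means mentioned above could be sketched like this. It reads "standard deviation of the mean" as the standard error of the difference between two independent class means, which is one possible interpretation. All numbers, the class sizes in particular, are invented for illustration.

```python
import math


def rough_z_for_mean_difference(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference in class means divided by its standard error.

    Treats the two classes as independent samples.
    """
    standard_error = math.sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
    return (mean_a - mean_b) / standard_error


# Hypothetical class statistics; the class sizes in particular are invented.
ratio = rough_z_for_mean_difference(
    mean_a=80, sd_a=5, n_a=25,    # Bobby's class
    mean_b=75, sd_b=10, n_b=25,   # Cindy's class
)
print(f"difference in means / standard error = {ratio:.2f}")
```

This needs no pairing of individual students. It only compares the two class averages, so it works even when the classes have different sizes.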
 
  • #24
@Ibix and @Gavran much more complicated than I expected. I believe I misunderstood the question. Gracias.
 
  • #25
Worth noting that arguing over the right way to do this kind of comparison is the basis of the perennial "test scores are improving" / "no, tests are being dumbed down" debate (complicated by a healthy dose of bias depending on whose preferred political party is in power, of course). If it were an easy topic with clear answers it'd probably be less of a political football.
 