- #1
Jean Tate
Can anyone help me with this, please?
It's about how to decide whether two distributions are consistent, statistically speaking; specifically, which statistical test (or tests) is most appropriate to use.
Here's the data:
N(A) N(B) G/R/P
0043 0046 #101
0264 0235 #102
0033 0029 #103

N(A) N(B) G/R/P
0172 0201 #201
1686 1496 #202
1444 1336 #203
Astronomical observations were made, and reduced to data, by two quite different teams using different telescopes, cameras, data reduction routines, etc. The first two columns (N(A) and N(B)) are counts, with leading zeros so that everything lines up nicely. "A" and "B" are two states, or conditions, or ... they are distinct and - for the purposes of this question - unambiguous. So the first cell of the first table says 43 cases of A (or with condition A) were observed.
The third column (G/R/P) is the name/label of the group/region/population observed. The two teams each observed the same group/region/population; the first table is the first team's data, the second the second.
There is nothing to say what the underlying ("true") distribution is, or should be. Nor is there any way to compare what the two teams observed: the 43 could be a proper subset of the 172 (first column, first row), an overlap, or disjoint. However, assume no mistakes at all in the assignment of "A" and "B".
Clearly, the two distributions - of states A and B, across the three groups/regions/populations - are different. However, is that difference statistically significant? What test - or tests - are appropriate to use, here?
More details? Consider these:
i) what's observed is white dwarf stars, in three different clusters; A is DA white dwarfs, B DB ones
ii) globular clusters, in three different galaxies; A is 'red' GCs, B 'blue'
iii) spiral galaxies, in three different galaxy clusters; A is 'anti-clockwise', B 'clockwise'
iv) radio galaxies, in three different redshift bins; A is 'FR-I', B 'FR-II'
v) GRBs, in three different RA bins; A is 'long', B is 'short'
(I don't think the details matter, in terms of the type of statistical test to use; am I right?)
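To make the question concrete, here is a minimal sketch of one candidate approach, not necessarily the right one: a Pearson chi-square test of homogeneity on a 2x2 table (team x state) for each group. It rests on two assumptions that are mine and may not hold: that the rows pair up by position (#101 with #201, and so on), and that the two teams' samples are independent, which, as noted above, cannot actually be verified. The function name chi2_2x2 is just illustrative.

```python
import math

# Counts copied from the two tables above, as (N(A), N(B)) per row.
# ASSUMPTION: rows are paired by position (#101 with #201, etc.).
team1 = [(43, 46), (264, 235), (33, 29)]
team2 = [(172, 201), (1686, 1496), (1444, 1336)]


def chi2_2x2(row1, row2):
    """Pearson chi-square (no continuity correction) for a 2x2 table.

    ASSUMPTION: the two rows are independent samples.
    Returns (chi2 statistic, p-value for 1 degree of freedom).
    """
    table = [row1, row2]
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    n = sum(row_tot)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_tot[i] * col_tot[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Survival function of a chi-square variable with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


for k, (r1, r2) in enumerate(zip(team1, team2), start=1):
    chi2, p = chi2_2x2(r1, r2)
    print(f"row {k}: chi2 = {chi2:.3f}, p = {p:.3f}")
```

If the samples overlap (e.g. the 43 are a subset of the 172), the independence assumption fails and this test would not be appropriate as written, which is part of what I'm asking about.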