- #1
neil.thompson
Hi everyone.
I'm afraid I don't know much about statistics, but I am trying to learn by working through a book and trying out some examples (I have mathematics experience, but from a biological perspective).
At the moment I am looking at the hypergeometric probability distribution. I have access to MATLAB, so I have been playing around with examples in that. As I understand it, the hypergeometric distribution gives the probability of getting a given number of successes in a sample drawn without replacement from a larger set, where the total number of successes in that larger set is known. That seems simple enough.
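For reference, this is the standard form of the probability mass function as I understand it. The symbols are my own labels, chosen to follow the argument order of hygecdf: M is the population size, K the number of successes in the population, N the sample size, and k the number of successes in the sample.

```latex
P(X = k) = \frac{\binom{K}{k}\,\binom{M-K}{N-k}}{\binom{M}{N}},
\qquad \max\bigl(0,\; N-(M-K)\bigr) \le k \le \min(N, K),
\qquad \sum_{k} P(X = k) = 1 .
```

The sum runs over every value of k that is actually possible, which is the sense in which I expect the total probability to be 1.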
However, I also expect that, as with every other distribution, the total probability sums to 1. So I am trying examples in MATLAB (http://www.mathworks.co.uk/help/toolbox/stats/hygecdf.html - there is an example of how to use it there too) and I am obviously doing something wrong. Imagine a total set of 61 balls: 30 of them are red and 31 of them are black. I take a sample of 34, without replacement, and find that 14 are red and 20 are black.
I think this is OK - there seems to be no requirement on equal divisions between the colours or anything like that. So I run, for the cumulative probability:
Red=hygecdf(14,61,30,34);
Black=hygecdf(20,61,31,34);
I get Red = 0.1260 and Black = 0.9520. I think that these two calculations should be equivalent, since they both have the same sample size (34), and that they should therefore sum to 1, but obviously they do not. I must be doing something very basic wrong!
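As a sanity check on the "sums to 1" expectation, here is a minimal sketch using the ball counts described above (61 balls, 30 red, sample of 34). It assumes the Statistics Toolbox is available and that hygepdf takes the same (x, M, K, N) argument order as hygecdf; the variable names are just illustrative.

```matlab
% Setup from the post: 61 balls, 30 red, sample of 34 drawn without replacement.
M = 61;   % population size
K = 30;   % red balls in the population
N = 34;   % sample size

% Red counts that can actually occur: at least N-(M-K) reds are forced
% (there are only M-K black balls), and at most min(N, K) reds are possible.
x = max(0, N - (M - K)):min(N, K);

% Probability of each individual red count.
p = hygepdf(x, M, K, N);

% Total probability over all possible outcomes; should be 1 up to rounding.
total = sum(p);
fprintf('sum of pmf over all red counts = %.12f\n', total);

% The CDF is a running sum of the pmf, so its value at the largest
% possible count is the same total.
c = hygecdf(x, M, K, N);
fprintf('cdf at the largest red count   = %.12f\n', c(end));
```

This only checks that the probabilities of the individual red counts add up to 1 across the whole range; it does not by itself say how the two cumulative values above ought to relate to each other.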
Sorry for all the words!
Thank you,
Neil.