Work out an estimate for the total number of ponies in the forest

  • #1
paulb203
Homework Statement
Wyatt wants to work out an estimate for the total number of wild ponies in a forest

In July, Wyatt catches 42 ponies in the forest
He puts a tag on each of these ponies and releases them

In August, Wyatt catches 60 ponies in the forest
He finds that 5 of the 60 ponies are tagged

Work out an estimate for the total number of ponies in the forest
Relevant Equations
N/A
My question: which branch of maths is this?
Also, can you give me a clue as to where to start with solving this? Just a hint please, not a full explanation.

I'm struggling to even guess at this one. I did think: 60 ponies, 5 of which are tagged, so 5/60 are tagged, which is 1/12.
1/12 of the 60 are tagged...
How does this relate to the initial number of 42 caught, and tagged, then released..?
Of the initial 42 caught, 100% were tagged...
Why did he only catch 42 in July, but caught 60 in August?

If only 5 of the 60 ponies were tagged, that means 55 weren't. If 55 weren't, then none of those 55 belong to the first lot of 42 that were caught and tagged. So there are AT LEAST 97 ponies in the forest (?).

Q. They've asked for an estimate for the total number of ponies in the forest; do they want us to include any of the 60 that Wyatt has caught at the time of asking and hasn't released yet?
 
  • #2
paulb203 said:
1/12 of the 60 are tagged...
If you catch 100 and find 30 have black manes, the rest brown manes, what would you guess about the proportion of black manes in the whole population?
 
  • #3
paulb203 said:
1/12 of the 60 are tagged...
How does this relate to the initial number of 42 caught, and tagged, then released..?
Of the initial 42 caught, 100% were tagged...
Why did he only catch 42 in July, but caught 60 in August?
Suppose you compare these two ratios:
(5 of second sample tagged)/(60 total in second sample) = 1/12
versus
(42 of entire population tagged)/(??? total in entire population)

How would you expect those ratios to compare? first greater? equal? second greater?
From that, can you estimate (??? total in entire population)

PS. This type of question is in the basic field of probability. Advanced examples are in the field of sampling theory.
 
  • #4
paulb203 said:
If only 5 of the 60 ponies were tagged, that means 55 weren't. If 55 weren't, then none of those 55 belong to the first lot of 42 that were caught and tagged. So there are AT LEAST 97 ponies in the forest (?).
The ideas in this question form the basis of statistical sampling. You can always have some fun with these questions. I like your answer of at least 97, because it might be hard to catch the same pony twice!

The underlying assumption is that Wyatt's method of pony counting is free from bias. So, 97 is definitely not the expected answer.

Have you enrolled in a statistical methods course without realising it?
 
  • #5
I think the way this problem is stated is rather misleading and doesn't guide the solver towards a simple solution. Suppose instead the problem were stated like this:

"From a box that contains an unknown number of white balls, we select 42 balls, paint them black and put them back in the box. Then we select (by a random procedure) 60 balls from the box and find 5 of them to be black. What is an estimate for the initial number of white balls in the box?"

Then I think the statement would not be misleading, and the solver would have an easier time finding the solution.
 
  • #6
haruspex said:
If you catch 100 and find 30 have black manes, the rest brown manes, what would you guess about the proportion of black manes in the whole population?
Thanks, haruspex.

If I caught 100 and found 30 had black manes, and 70 had brown manes, I would guess 30% of the whole population had black manes.

How does this relate to my question..?

The result of the second ‘round-up’ of ponies was;

He caught 60 and found 5 of them were tagged, 55 weren’t tagged.

So 5 of them had already been caught, in the first round-up (of 42), 55 hadn’t already been caught, they were first time captures.

Now, there are 60 captive ponies, 5 of which were caught first time around, and 37 ponies, at least, out in the forest, yes?
 
  • #7
FactChecker said:
Suppose you compare these two ratios:
(5 of second sample tagged)/(60 total in second sample) = 1/12
versus
(42 of entire population tagged)/(??? total in entire population)

How would you expect those ratios to compare? first greater? equal? second greater?
From that, can you estimate (??? total in entire population)

PS. This type of question is in the basic field of probability. Advanced examples are in the field of sampling theory.
Thanks, FactChecker.

I would expect the ratios to be equal; 1/12=42/504

Is the answer;

An estimate for the total number of ponies in the forest is 504?

If so, I think I understand it better working backwards;

There are 504 ponies in a forest. Wyatt rounds up 42, tags them, then releases them. He then rounds up 60 to find 1/12 of them (5) are tagged. This is approximately what he expected as he had tagged 1/12 of the total population.

You said this is in the field of probability. My first thought, when I started to read your answer, was proportion (direct proportion, to be more specific). I’m guessing now that the field is probability, and that some of its tools are ratio and proportion?
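
The "working backwards" check above can also be run as a quick simulation. This is only a sketch: it assumes every pony is equally likely to be caught, and the function name `simulate_recapture`, the seed and the number of trials are illustrative choices, not part of the original question.

```python
import random
from statistics import mean

def simulate_recapture(population, tagged, sample_size, trials=10_000, seed=1):
    """Average number of tagged ponies found in a random second sample."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    counts = []
    for _ in range(trials):
        # Ponies numbered 0..tagged-1 are the ones tagged in July.
        caught = rng.sample(range(population), sample_size)  # no pony caught twice
        counts.append(sum(1 for pony in caught if pony < tagged))
    return mean(counts)

average_tagged = simulate_recapture(population=504, tagged=42, sample_size=60)
print(f"average tagged ponies in a sample of 60: {average_tagged:.2f}")
```

If the forest really held 504 ponies, the average should come out close to 5, although any single round-up can easily give 3 or 7 instead.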
 
  • #8
paulb203 said:
If so, I think I understand it better working backwards;

There are 504 ponies in a forest. Wyatt rounds up 42, tags them, then releases them. He then rounds up 60 to find 1/12 of them (5) are tagged. This is approximately what he expected as he had tagged 1/12 of the total population.

Oh yes it's easy if you start with the answer :wink:!

I like to solve this problem like this:
  • If you round up 60 ponies and ## \frac{5}{60} = \frac{1}{12} ## of them are tagged you can guess that ## \frac{1}{12} ## of all the ponies in the forest are tagged.
  • You know that exactly 42 of the ponies in the forest are tagged.
  • Now if 42 is exactly ## \frac{1}{12} ## of the ponies in the forest then there are exactly ## 42 \times 12 = 504 ## ponies in the forest in all.
  • But ## \frac{1}{12} ## is only an estimate, so our answer of 504 is only an estimate too.
  • Because we are working with estimates the best answer might be "I estimate that there are about 500 ponies in the forest".
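
The steps above can be sketched as a tiny function. Ecologists usually call this ratio the Lincoln-Petersen estimate; that name and the function name `estimate_population` do not come from the question, and the sketch assumes the second sample is unbiased.

```python
def estimate_population(tagged_first, caught_second, tagged_in_second):
    """Estimate population size by equating the two tagged proportions."""
    if tagged_in_second == 0:
        raise ValueError("no tagged animals recaptured; the ratio gives no estimate")
    # tagged_in_second / caught_second is assumed to match tagged_first / population
    return tagged_first * caught_second / tagged_in_second

ponies = estimate_population(tagged_first=42, caught_second=60, tagged_in_second=5)
print(ponies)  # 504.0
```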

paulb203 said:
You said this is in the field of probability.
The distinctions between fields of mathematics are a bit blurry but I think most people would consider this as in the field of statistics rather than probability.

paulb203 said:
some of the tools are ratio and proportion?
I don't think most people would consider these particularly as tools of statistics (or probability).

One important concept (or tool if you like) that is used here is the (unstated) assumption that the sample of 60 ponies is an unbiased sample.
 
  • #9
pbuk said:
I don't think most people would consider these particularly as tools of statistics (or probability).
Well, ratio and proportion are tools of all branches of mathematics, but they are elementary tools. I guess what you would count as tools of statistics are things like the central limit theorem or the least squares method.
 
  • #11
Delta2 said:
Well, ratio and proportion are tools of all branches of mathematics, but they are elementary tools. I guess what you would count as tools of statistics are things like the central limit theorem or the least squares method.
I would prefer to call it a statistical or probability problem for the reason that the issue of drawing an unbiased, independent sample should be addressed. Admittedly, the problem statement does not say anything about that so it probably is not intended to be a statistical question.
 
  • #12
paulb203 said:
If I caught 100 and found 30 had black manes, and 70 had brown manes, I would guess 30% of the whole population had black manes.

How does this relate to my question..?
So change "black manes" to "tags".
 
  • #13
pbuk said:
Because we are working with estimates the best answer might be "I estimate that there are about 500 ponies in the forest".

One important concept (or tool if you like) that is used here is the (unstated) assumption that the sample of 60 ponies is an unbiased sample.
Why would we assume that? Perhaps ponies usually go around in groups of 40-60. And, in the second sample we got a new group, plus a few stragglers from the first group that was rounded up the first time.

Honestly, IMO, "at least 97" is a better answer than "about 500".
 
  • #14
The problem statement does not say if the estimate should be a number. It might be an interval.
The "97" is an estimate of the lower end of the interval, but not necessarily the lower end: some tagged ponies could have died between the July and the August counts.
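
One way to turn "the estimate might be an interval" into numbers is to ask which population sizes make the observed sample reasonably likely. The sketch below assumes random sampling without replacement and no deaths or departures between July and August; the 15% cutoff and the search range up to 3000 are arbitrary illustrative choices, not a standard prescribed by the question.

```python
from math import comb

def likelihood(population, tagged=42, caught=60, recaptured=5):
    """Hypergeometric probability of finding `recaptured` tagged ponies."""
    untagged = population - tagged
    if untagged < caught - recaptured:
        return 0.0  # fewer than 97 ponies cannot produce this sample
    return (comb(tagged, recaptured) * comb(untagged, caught - recaptured)
            / comb(population, caught))

values = {n: likelihood(n) for n in range(60, 3001)}
best = max(values.values())
# Keep every population size whose likelihood is at least 15% of the maximum.
plausible = [n for n, v in values.items() if v >= 0.15 * best]
print("most likely size:", max(values, key=values.get))
print("plausible range:", min(plausible), "to", max(plausible))
```

Under these assumptions the likelihood is exactly zero below 97 and peaks at 503 and 504 (the two values tie), so the simple ratio answer sits at the most likely value while the plausible range around it is wide.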
 
  • #15
Hill said:
The problem statement does not say if the estimate should be a number. It might be an interval.
The "97" is an estimate of the lower end of the interval, but not necessarily the lower end: some tagged ponies could have died between the July and the August counts.
Or left the forest!
 
  • #16
Hill said:
The problem statement does not say if the estimate should be a number. It might be an interval.
The "97" is an estimate of the lower end of the interval, but not necessarily the lower end: some tagged ponies could have died between the July and the August counts.
I did say you can have some fun with these problems!
 
  • #17
There are all sorts of real-world issues that would need to be addressed in a problem like this that are not mentioned in the problem statement. IMO, that is a good reason to assume that the expected answer is a simple ratio calculation. It's just an academic exercise. In such a situation, I am not inclined to go looking for trouble with real-world issues.
 
  • #18
PeroK said:
I did say you can have some fun with these problems!

I would guess that the OP is at the early stages of learning some statistics: how does you having fun help him here?
 
  • #19
pbuk said:
I would guess that the OP is at the early stages of learning some statistics: how does you having fun help him here?
The problem does not state any projected 'variations' of the sample or the population, so one has to assume that the pony catcher was relying on homogeneity rather than heterogeneity.

Question for you (and all).
Sampling: in this case all of the 60 ponies were collected, penned, and then counted, and all 60 were released.
What if the pony catcher had collected one pony at a time, done his categorical counting, and then released it back into the wild population?
Would this have affected the statistical analysis?
Normally one assumes that the population is much larger than the sample. In this case the sample seems to be about 10% of the population.
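
A rough way to explore the question above is to compare the two sampling schemes directly. The sketch assumes a population of 504 (the thread's estimate, not a known value) and equal catchability; the function name `tagged_count_stats` is made up for illustration. Catching one pony at a time and releasing it is treated as sampling with replacement (binomial), while penning all 60 is sampling without replacement (hypergeometric).

```python
def tagged_count_stats(population, tagged, sample_size, with_replacement):
    """Mean and variance of the number of tagged ponies seen in the sample."""
    p = tagged / population
    mean = sample_size * p
    variance = sample_size * p * (1 - p)  # binomial variance
    if not with_replacement:
        # finite population correction for the hypergeometric case
        variance *= (population - sample_size) / (population - 1)
    return mean, variance

one_at_a_time = tagged_count_stats(504, 42, 60, with_replacement=True)
all_penned = tagged_count_stats(504, 42, 60, with_replacement=False)
print("one at a time:", one_at_a_time)
print("all penned:   ", all_penned)
```

Both schemes expect 5 tagged ponies, so the ratio estimate is unchanged; the one-at-a-time scheme just has a somewhat larger spread, because the same pony can be counted more than once.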
 
  • #20
Hill said:
some tagged ponies could have died between the July and the August counts.
That was one of the reasons I wrote what I wrote in post #5. Although in my equivalent problem with the box and balls, some paint could have rubbed off the freshly painted black balls and given rise to "blackenwhited" balls, haha.
 
  • #21
pbuk said:
One important concept (or tool if you like) that is used here is the (unstated) assumption that the sample of 60 ponies is an unbiased sample.
I reckon that is indeed the main issue here: the probability of finding a tagged pony in the sample should be about the same as the probability of finding a tagged pony in the entire population. That equality is what it means for the sample to be unbiased (with respect to the tagged property).
 
  • #22
pbuk said:
I would guess that the OP is at the early stages of learning some statistics: how does you having fun help him here?
Because these assumptions are the heart of this whole subject. It's a bit like teaching programming and saying don't bother with a test plan. I see this sort of teaching as anti-thinking. Just take the numbers you're given and manipulate them in the simplest way.

That's not what statistics should be about. Maths teaching should be better than this.
 
  • #23
PeroK said:
Have you enrolled in a statistical methods course without realising it?
Thanks, PeroK.

And no :), it was just one of a variety of questions in my latest GCSE (UK) assessments.
 
  • #24
256bits said:
The problem does not state any projected 'variations' of the sample or the population, so one has to assume that the pony catcher was relying on homogeneity rather than heterogeneity.

Question for you (and all).
Sampling: in this case all of the 60 ponies were collected, penned, and then counted, and all 60 were released.
What if the pony catcher had collected one pony at a time, done his categorical counting, and then released it back into the wild population?
Would this have affected the statistical analysis?
Normally one assumes that the population is much larger than the sample. In this case the sample seems to be about 10% of the population.
The Pony Catcher. Coming Soon, on Amazon Prime.
 
  • #25
Delta2 said:
That was one of the reasons I wrote what I wrote in post #5. Although in my equivalent problem with the box and balls, some paint could have rubbed off the freshly painted black balls and given rise to "blackenwhited" balls, haha.
Blackenwhited balls. Can you get cream for that?
 
  • #26
Paulb203:
Hope you don't mind if I elaborate on the case of the German Tank Problem, first setting out some background.
In estimation theory, you consider estimators (for different statistics: mean, etc.) with different properties: biased/unbiased (whether the expected value of the estimator equals the quantity it estimates), minimum variance, maximum likelihood, etc.
In the German Tank Problem above, the issue is to provide an estimate for the maximum of a discrete distribution. One estimate is based on standard frequentist theory; the other is Bayesian. In this problem, in WW2, the Allies wanted to estimate the number of tanks that the Germans were producing, based on the serial numbers of tanks that had been abandoned by the Germans.
Numbers were assumed to be assigned in numerical order, starting with 1. Four tanks were found. The estimate of the maximum serial number (and thus of the number of tanks) was given by the largest serial number plus the average of the gaps between the serial numbers found. Say the numbers found were 13, 19, 42, 60. Then the max number found was 60, and the average gap was ##\frac{5+22+17}{3}=\frac{44}{3}##

Thus the estimate of the maximum was 60+44/3 = (approx.) 75 tanks. This was based on frequentist theory.
But from your setup, there are differences to consider before applying this approach.
Maybe @Dale can verify or knows about the Bayesian approach?
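
The arithmetic in this example can be sketched in Python. The first function follows the description given here (gaps of 5, 22 and 17 unseen serial numbers between neighbours). The second is the formula usually quoted for the frequentist estimate, m + m/k - 1 with m the largest serial and k the sample size, which also counts the gap below the smallest serial number; it is included only for comparison, and both function names are made up for illustration.

```python
def gap_estimate(serials):
    """Largest serial plus the average count of unseen serials between neighbours."""
    ordered = sorted(serials)
    gaps = [b - a - 1 for a, b in zip(ordered, ordered[1:])]
    return ordered[-1] + sum(gaps) / len(gaps)

def textbook_estimate(serials):
    """m + m/k - 1, where m is the largest serial and k the sample size."""
    m, k = max(serials), len(serials)
    return m + m / k - 1

found = [13, 19, 42, 60]
print(gap_estimate(found))       # about 74.67
print(textbook_estimate(found))  # 74.0
```

The two versions differ by less than one tank here, so the "about 75" figure holds either way.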
 
  • #27
One common real-world bias that the original problem might have is the "self-selection" bias. Suppose it is only the slowest horses that tend to be caught in both the first and second group of horses. Then there might be a lot of fast horses that were never subject to the tagging and the estimate might tend to underestimate the total number of horses. The "self-selection" bias is very common. To avoid this possibility in the problem statement, it should be stated that all horses are equally likely to be caught.
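
The effect described above can be illustrated with a small simulation. Everything in it is hypothetical: a true population of 504, half of them "slow" animals that are five times as likely to be caught as the "fast" half, and made-up function names. Weighted catching without replacement is done with Efraimidis-Spirakis random keys, which is one of several ways to do it.

```python
import random
from statistics import median

def weighted_catch(weights, k, rng):
    """Catch k distinct animals, favouring those with larger weights.

    Each animal gets the key u ** (1 / w); the k largest keys are caught.
    """
    keys = [(rng.random() ** (1.0 / w), i) for i, w in enumerate(weights)]
    return {i for _, i in sorted(keys, reverse=True)[:k]}

def median_estimate(weights, first=42, second=60, trials=1000, seed=7):
    """Median ratio estimate over many simulated tag-and-recapture rounds."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(trials):
        tagged = weighted_catch(weights, first, rng)
        recaught = weighted_catch(weights, second, rng)
        overlap = len(tagged & recaught)
        # With no recaptures the ratio gives no finite estimate.
        estimates.append(first * second / overlap if overlap else float("inf"))
    return median(estimates)

N = 504  # hypothetical true population
equal = median_estimate([1.0] * N)
slow_and_fast = median_estimate([5.0] * (N // 2) + [1.0] * (N // 2))
print("equal catchability:", equal)
print("slow animals favoured:", slow_and_fast)
```

With equal catchability the typical estimate stays near the true size, while favouring the slow animals produces more recaptures and so pulls the estimate well below it.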
 
  • #28
FactChecker said:
To avoid this possibility in the problem statement, it should be stated that all horses are equally likely to be caught.
Rather than stating assumptions in the question, statistics is often examined by asking students to explain them in follow-on questions; see for example question 15 in this specimen GCSE Statistics paper published by AQA:

Kirstie is estimating the population of fish in a lake.
She catches some fish and marks them with an[sic] harmless dye.
She then returns them to the lake.
One week later she catches a smaller sample of 50 fish and sees that 6 of them are marked.
She correctly estimates there are 1125 fish in the lake

(a) How many fish did she originally mark?

(b) (i) State two assumptions Kirstie makes to ensure this process is valid.
(b) (ii) Evaluate one of these assumptions; stating clearly which one it is.
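
Part (a) runs the same proportion in reverse. A minimal sketch, assuming the marked fraction in the second sample equals the marked fraction in the lake (the function name is made up for illustration):

```python
def originally_marked(estimated_population, second_sample, marked_in_second):
    """Number marked in the first catch implied by the estimate and the second sample."""
    return estimated_population * marked_in_second / second_sample

print(originally_marked(1125, 50, 6))  # 135.0
```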
 
  • #29
pbuk said:
Rather than stating assumptions in the question, statistics is often examined by asking students to explain them in follow-on questions; see for example question 15 in this specimen GCSE Statistics paper published by AQA:
I really like that. I think that they should be shown some good examples of how to state the problem before they are asked this.
 

FAQ: Work out an estimate for the total number of ponies in the forest

How can I estimate the total number of ponies in the forest?

You can estimate the total number of ponies in the forest by using sampling methods, such as the mark-recapture technique. This involves capturing a sample of ponies, marking them, releasing them back into the forest, and then capturing another sample to see how many marked ponies are in the second sample.

What factors should I consider when estimating the pony population?

Factors to consider include the size of the forest, the availability of food and water, the presence of predators, and the birth and death rates of the ponies. These factors can influence the population density and distribution of ponies in the forest.

How accurate are population estimates using sampling methods?

Population estimates using sampling methods can be quite accurate if the sampling is done correctly and the assumptions of the method are met. However, there is always some degree of uncertainty, and estimates can be affected by factors such as sampling bias, the behavior of the ponies, and environmental changes.

Can technology help in estimating the number of ponies in the forest?

Yes, technology can significantly aid in estimating pony populations. Tools such as drones, camera traps, and GPS collars can provide valuable data on pony movements and densities. Remote sensing and geographic information systems (GIS) can also help in mapping and analyzing the habitat to better understand the distribution of ponies.

What are some common challenges in estimating pony populations in a forest?

Common challenges include dense vegetation that makes it difficult to spot ponies, the mobility and elusive nature of ponies, and environmental factors that can affect their visibility and behavior. Additionally, ensuring that samples are representative of the entire population can be difficult in a large and diverse forest.
