How Does the Central Limit Theorem Estimate Error in Area Calculations?

In summary, this thread discusses estimating the area of an object A in the unit square from a sequence of uniformly distributed random points. Treating Y as a Bernoulli random variable with success probability A, the area can be estimated by A ≈ (1/n)∑Y_i. The Central Limit Theorem can then be used to gauge the probable size of the error in the estimate, but the poster is unsure how to calculate the required n.
  • #1
nolita_day
Your help/input is much appreciated! Thanks in advance! :smile:

Object A is in the unit square 0<x<1, 0<y<1. Consider a random point distributed uniformly over the square; let Y=1 if the point lies inside A and Y=0 otherwise. How could the area of A be estimated from a sequence of n independent points uniformly distributed on the square?

I got this part just taking Y as ~Bernoulli(p=A). Let [itex]Y_{i} = f(X_{i})[/itex], then [itex]A≈\frac{1}{n}\sum{Y_{i}}[/itex]
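A minimal sketch of this estimator, assuming a hypothetical object A (the quarter disc x² + y² < 1, whose true area π/4 is known) so the result can be checked. The names `inside_quarter_disc` and `estimate_area` are illustrative and do not come from the original problem.

```python
import random


def inside_quarter_disc(x, y):
    """Hypothetical object A: the quarter disc x^2 + y^2 < 1 (true area pi/4)."""
    return x * x + y * y < 1.0


def estimate_area(indicator, n, rng):
    """Return (1/n) * sum(Y_i), where Y_i = 1 if the i-th uniform point lies in A."""
    hits = 0
    for _ in range(n):
        # Draw one point uniformly from the unit square.
        x, y = rng.random(), rng.random()
        if indicator(x, y):
            hits += 1
    return hits / n


rng = random.Random(0)  # fixed seed so the run is reproducible
a_hat = estimate_area(inside_quarter_disc, 100_000, rng)
print(a_hat)  # should land close to pi/4
```

Any other shape can be swapped in by passing a different indicator function, since the estimator only needs to know whether each point falls inside A.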

How could the CLT be used to gauge the probable size of the error of the estimate? Denoting the estimate by [itex]\hat{A}[/itex], if [itex]A=0.2[/itex], how large should n be so that [itex]P(|\hat{A} - A| < .01) ≈ .99[/itex]?

I can't figure out this one. Am I supposed to take [itex]\hat{A}[/itex] as the random variable or [itex]|\hat{A}-A|[/itex] as the RV? What is the standard deviation of [itex]\hat{A}[/itex] vs. [itex]|\hat{A}-A|[/itex]? My attempt below is only for the first part of the question in the previous paragraph.

Taking [itex]\hat{A}[/itex] as the RV, I started out like this...

[itex]E[\dfrac{Y_{1}+...+Y_{n}}{n}]=\hat{A}[/itex]
Then [itex]Var(\hat{A})=Var( E[\dfrac{Y_{1}+...+Y_{n}}{n}] )=\frac{1}{n^{2}}Var( nE[Y_{i}] ) = Var( E[Y_{i}] ) = Var(A)[/itex] (since Yi ~ Bernoulli(A) = A)

To estimate [itex]A[/itex] based on [itex]\hat{A}[/itex], pick a small error and a high probability of getting that small error and you could use CLT as follows:

[itex]P(|\hat{A} - A| < .01) = .99[/itex]
[itex]P(-.01 < \hat{A} - A < .01) = .99[/itex]
[itex]P(\dfrac{-.01}{σ_{\hat{A}}} < \dfrac{\hat{A} - A}{σ_{\hat{A}}} < \dfrac{.01}{σ_{\hat{A}}}) = .99[/itex]
[itex]P(\dfrac{-.01}{\sqrt{A}} < Z < \dfrac{.01}{\sqrt{A}}) = .99[/itex]
[itex]\Phi(\dfrac{.01}{\sqrt{A}}) - \Phi(\dfrac{-.01}{\sqrt{A}}) = .99[/itex]
[itex]2\Phi(\dfrac{.01}{\sqrt{A}}) - 1 = .99[/itex]
[itex]\dfrac{.01}{\sqrt{A}}= 2.58[/itex]
[itex]A = .000015023?[/itex]

I'm definitely sure this is wrong... how the heck did I come up with an actual number for A?
 
  • #2
This is a bit hard to follow because you confuse A with A-hat.
[itex]\hat{A} = \sum Y_i/n, \quad A = E(\hat{A})[/itex].
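Following that hint, one way the calculation could go is sketched here. It assumes the points are independent, so the variances of the Y_i add, and it uses the standard normal quantile 2.576 for a two-sided probability of 0.99. The unknown is n; A = 0.2 is given.

```latex
\begin{align*}
\operatorname{Var}(Y_i) &= A(1-A), \\
\operatorname{Var}(\hat{A}) &= \frac{1}{n^{2}} \sum_{i=1}^{n} \operatorname{Var}(Y_i) = \frac{A(1-A)}{n}, \\
\sigma_{\hat{A}} &= \sqrt{\frac{A(1-A)}{n}}, \\
P\left(|\hat{A} - A| < 0.01\right) &\approx 2\Phi\!\left(\frac{0.01}{\sigma_{\hat{A}}}\right) - 1 = 0.99
\;\Longrightarrow\; \frac{0.01}{\sigma_{\hat{A}}} \approx 2.576, \\
n &\approx \frac{2.576^{2}\, A(1-A)}{0.01^{2}} = \frac{2.576^{2} \times 0.2 \times 0.8}{0.0001} \approx 1.06 \times 10^{4}.
\end{align*}
```

With the rounded quantile 2.58 used in the first post, the same formula gives about 10,650, so on the order of ten thousand points are needed.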
 

FAQ: How Does the Central Limit Theorem Estimate Error in Area Calculations?

1. What is the Central Limit Theorem (CLT)?

The Central Limit Theorem (CLT) states that the mean of a large number of independent, identically distributed observations with finite variance is approximately normally distributed, regardless of the shape of the underlying population distribution.

2. How is the CLT used in the estimation of area?

In the area problem, each random point gives a Bernoulli indicator Y_i with mean A and variance A(1−A), so the estimate Â = (1/n)∑Y_i has mean A and standard deviation √(A(1−A)/n). The CLT says Â is approximately normal for large n, which lets us attach a confidence interval to the estimate (a range that contains the true area with a stated probability) or choose n to reach a target accuracy.
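As a rough illustration, the sample size and the confidence interval can be computed with the normal approximation. This is a sketch under that approximation only; the function names `required_n` and `normal_interval` are hypothetical, and the inputs (area 0.2, tolerance 0.01, probability 0.99) are taken from the question in the thread.

```python
import math
from statistics import NormalDist


def required_n(p, eps, conf):
    """Smallest n with P(|A_hat - p| < eps) ~= conf under the normal approximation."""
    # Two-sided quantile, e.g. about 2.576 for conf = 0.99.
    z = NormalDist().inv_cdf(0.5 + conf / 2.0)
    return math.ceil(z * z * p * (1.0 - p) / (eps * eps))


def normal_interval(a_hat, n, conf):
    """Approximate confidence interval for the area from an estimate a_hat."""
    z = NormalDist().inv_cdf(0.5 + conf / 2.0)
    half_width = z * math.sqrt(a_hat * (1.0 - a_hat) / n)
    return a_hat - half_width, a_hat + half_width


n = required_n(0.2, 0.01, 0.99)
low, high = normal_interval(0.2, n, 0.99)
print(n, low, high)
```

In practice the true area is unknown when planning n, so a guess for p is needed; p = 0.5 gives the largest variance and therefore a conservative sample size.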

3. Why is the CLT important in statistics?

The CLT is important in statistics because it justifies using the normal distribution to quantify the uncertainty of a sample mean, even when the population itself is not normal. It also explains why the spread of a sample mean shrinks in proportion to 1/√n as the sample size n grows.

4. What are the assumptions of using the CLT?

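The assumptions of using the CLT include: 1) the sample is randomly selected from the population, 2) the observations in the sample are independent of each other, 3) the population has finite variance, and 4) the sample size is large enough. A common rule of thumb is n > 30, but skewed distributions need more; for a proportion p, a frequent guideline is that both np and n(1−p) should be at least about 10.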

5. Can the CLT be used for any type of data?

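The CLT can be used for any type of data, discrete or continuous, as long as the assumptions are met. The data do not need to follow a normal distribution; the 0/1 indicators in the area problem are a discrete example. It does not apply to distributions without finite variance, such as the Cauchy distribution.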
