Question about data analysis and error

In summary, the problem is to find ##x## and ##\delta x## such that ##|c| \leq \delta c##. The proposed solution is to use the histogram of the accepted ##x## values to calculate a mean and standard deviation.
  • #1
Arman777
Let us suppose we have one constant ##b \pm \delta b = 20 \pm 1## and one function that depends on ##x##, say ##a(x) \pm \delta a##.

The problem is that I want the difference between ##a(x)## and ##b## to be ##0##. Let me denote this difference as ##c \pm \delta c##. For the difference to be consistent with ##0##, we need the condition ##c \leq \delta c##.

We are also given a range of ##x ## values.

So the problem is something like this:

Take a range of ##x## values (such as ##[0, 2]##).

For each value of ##x## in this range:

1) Calculate ##a(x) \pm \delta a##.

2) Take the difference between ##a(x)## and ##b##:
##c \pm \delta c = (a(x) \pm \delta a) - (b \pm \delta b)##

3) If ##c \leq \delta c## (i.e. if the difference can be ##0##), add ##x## to an array that stores the accepted values.

Now after this we have some values of ##x## that satisfy ##c \leq \delta c##. How can I find ##x## and ##\delta x## from this array?
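
In code, the scan would look something like this (the ##a(x)## and ##\delta a## below are made up purely for illustration; only ##b \pm \delta b## is the constant given above):

Python:
import numpy as np

# Hypothetical model: a(x) = 200*x with a 10% relative error (illustration only).
def a(x):
    return 200.0 * x

def delta_a(x):
    return 0.10 * a(x)

b, delta_b = 20.0, 1.0                        # the given constant, 20 ± 1

accepted = []
for x in np.linspace(0.0, 2.0, 2001):         # the given range of x values
    c = a(x) - b                              # central value of the difference
    delta_c = np.hypot(delta_a(x), delta_b)   # errors combined in quadrature
    if abs(c) <= delta_c:                     # difference consistent with 0 (note the absolute value)
        accepted.append(x)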
 
Last edited:
  • #2
Some caveats are in order:

1) You want to be sure ##\Delta a## and ##\Delta b## are uncorrelated.
2) You mean ##|c| \leq \Delta c##, of course.

And the 'for each ##x##' can lead to a wrong result. Numerical example:

##a = 400 \pm 40##, ##b = 360 \pm 20##, so ##c = 40 \pm 45##.

And let ##a = 200x \Rightarrow x = 2.0 \pm 0.2##.

Your array of ##x## is, for example, ##1.78, 1.79, \ldots, 2.22##.
Then the average is ##2.0## and the standard deviation ##0.13##.

Reason: your ##x## has a flat distribution instead of a normal distribution.
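
A quick check with that array:

Python:
import numpy as np

x = np.arange(178, 223) / 100.0   # 1.78, 1.79, ..., 2.22
print(x.mean())                   # 2.0
print(x.std())                    # ≈ 0.13, well below the 0.2 we started from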
 
  • #3
BvU said:
1) You want to be sure ##\Delta a## and ##\Delta b## are uncorrelated.
2) You mean ##|c| \leq \Delta c##, of course.
Yes
BvU said:
Reason: your ##x## has a flat distribution instead of a normal distribution.
Yes, I have noticed that... that seems like trouble.
BvU said:
And the 'for each ##x##' can lead to a wrong result. Numerical example:

##a = 400 \pm 40##, ##b = 360 \pm 20##, so ##c = 40 \pm 45##.

And let ##a = 200x \Rightarrow x = 2.0 \pm 0.2##.

Your array of ##x## is, for example, ##1.78, 1.79, \ldots, 2.22##.
I mean it a different way, actually. Think of it like this:

You have some range of ##x## values from ##0## to ##2##. Then you run a loop such that

for each value of ##x##:
    ##c \pm \delta c = (a(x) \pm \delta a) - (b \pm \delta b)##
    if ##|c| \leq \delta c##:
        add ##x## to an array
 
  • #4
I have revised the problem
 
Last edited:
  • #5
Arman777 said:
I have revised the problem
Makes my reply quasi-incomprehensible :cry:!

Even so: some more editing is in order :wink:
 
  • #6
BvU said:
Makes my reply quasi-incomprehensible :cry:!

Even so: some more editing is in order :wink:
Waiting
 
  • #7
Waiting for? Ah, ##\TeX## details adjusted.
Arman777 said:
Then you are doing a loop such that
That doesn't tell us what the probability distribution of ##x## is going to be.
 
  • #8
Arman777 said:
I have revised the problem

Please don't do that. Because...

BvU said:
Makes my reply quasi-incomprehensible :cry:!
 
  • #9
Vanadium 50 said:
Please don't do that. Because...
Sorry about that.
 
  • #10
BvU said:
Waiting for? Ah, ##\TeX## details adjusted.
That doesn't tell us what the probability distribution of ##x## is going to be.
Well, I have printed the values and they were normally distributed.
 
  • #11
So how can that be? Pure luck, or something else, like the dependency of ##a## on ##x##?
Or some Monte Carlo principle?
 
  • #12
BvU said:
So how can that be? Pure luck, or something else, like the dependency of ##a## on ##x##?
Or some Monte Carlo principle?
It actually uses a similar idea to yours. The relationship between ##x## and ##a## is linear. So as we increase ##x##, at some point the difference ##c## hits ##-\delta c## and at another point it hits ##+\delta c##. The problem is how I can calculate ##x## and ##\delta x## from the resulting list of values.
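
For concreteness: if ##a(x) = mx## (a linear form as assumed here; the actual slope is not given) and ##\delta c## is treated as roughly constant across the accepted band, the two crossing points follow directly from ##a(x) - b = \pm\delta c##:

$$x_\pm = \frac{b \pm \delta c}{m}, \qquad \bar{x} = \frac{x_+ + x_-}{2} = \frac{b}{m}, \qquad \frac{x_+ - x_-}{2} = \frac{\delta c}{m}$$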

Figure_1.png


This is the histogram of the ##x## values.
 
  • #13
Looks pretty uniform to me :smile:
 
  • #14
BvU said:
Looks pretty uniform to me :smile:
So mean(x_values) and std(x_values) are the correct solution?
 
  • #15
According to the numerical example, the mean would be OK; the standard deviation would be too optimistic.
 
  • #17
Yes, I know. In the example I started with ##2 \pm 0.2##. The flat distribution gave a sigma of 0.13, about a factor of ##\sqrt 3## lower. Your procedure cuts at ##\pm\sigma## -- whereas a Gauss distribution has, what was it again, 68% within ##\pm\sigma## and the remainder outside, contributing happily to the standard deviation.
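
A quick numerical illustration of that factor (a sketch, reusing the example's numbers):

Python:
import numpy as np

rng = np.random.default_rng(0)
gauss = rng.normal(2.0, 0.2, 1_000_000)   # x = 2.0 ± 0.2, normally distributed
flat = np.linspace(1.8, 2.2, 10001)       # flat distribution, cut at ±1 sigma
print(gauss.std())                        # ≈ 0.2
print(flat.std())                         # ≈ 0.2 / sqrt(3) ≈ 0.115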
 

