[Fortran90]Problem with Gaussian distribution

In summary: I don't know how to generate a uniform random number between -0.1 and 0.1. So, the second way I tried was to use the following code do i=1,n j=1,n/2
  • #1
Aner
9
0
Hi,
I have a problem in my program and I cannot figure it out.
In the last post I had a problem about some arrays, I perfectly resolved it thanks to you, but now my problem is a little bit subtle. I have a subroutine(here I'll post it has a program )that generates random numbers in order to create errors so that those must have a Gaussian distribution and it is perfect, the program works perfectly, I can have a really nice Gaussian until I centred my points (that is what I must do to associated the errors to my data). After I centred the points I get a weird accumulation of point near x=1 that I can't explain.
Here is the program I'm talking about
Fortran:
Program gauss
implicit none

character(len=6)::gaussn
real,dimension(10000)::m,a,ltot,pron
REAL::x,l
INTEGER::i,n,k,p,j,fn=10000

print*,"scrivere il nome del file"

read*,gaussn   OPEN(UNIT=1,FILE=gaussn)

do f=1,n

read(1,*)m(f)

pron(f)=2*0.1*(m(f)-0.5)
end do
close (unit=1)a=0

l=0.1/700do i=1,np=int(pron(i)/l)

a(p+(n/2+1))=a(p+(n/2+1))+1
!a(p)=a(p)+1
end do

do j=1,n
ltot(j)=-(n/2.0)+j

write(unit=15,fmt=*)ltot(j),a(j)
end do

end program gauss
The file that it reads is a column of data that I generate in my main program. I printed two files, one with the already centred points and one with the not centred ones. Now I would expect those two sets of points to give a Gaussian distribution but a you can see those are pretty different.
This pron(f)=2*0.1*(m(f)-0.5) is the line that I use to centre my points, it's something that we did in class with the teacher. I even thought that maybe it was the minus sign that was bothering fortran and I did some experiments like changing it with a plus and defining m(f) as negative. In the main program if I use the centred points I have a "lack" of precision, if I expect my output to be -3 I'll get -2.993... with an error of order 10-7 when with the not centred ones I get -3.001... with an error of order 10 -3.
I can't understand the problem if there is one or it is just me, maybe I'm missing something.
Thanks in advanced.
 

Attachments

  • gaussnotc.png
    gaussnotc.png
    4.1 KB · Views: 720
  • gausscentred.png
    gausscentred.png
    2.3 KB · Views: 666
Technology news on Phys.org
  • #2
I don't quite get what you are trying to do...can you try again to better explain what you mean by "centering" the data? In which direction? x? y?

f is just the array index, that's fine. But if m(f) is the y value of your data, what does "-0.5" had to do with centering? It is no like the maximum value is 1.0! If you want to center the data in the y direction (and allow negative values) then, you need to subtract half the maximum value from all data points, not just "half" (0.5). You maximum value is over 200...I don't see how taking 0.5 away from it centers it. In the same line, you divide by 10...are you scaling everything down? and then multiply by 2...scaling again?

In the following loop you take pron and multiply by 7000! what is that about? Then, you take these function values and mix them up with the index variable n, in effect, you are kind of transposing the whole thing using function values as your index...is this on purpose? I am confused.
 
  • #3
gsal said:
I don't quite get what you are trying to do...can you try again to better explain what you mean by "centering" the data? In which direction? x? y?

f is just the array index, that's fine. But if m(f) is the y value of your data, what does "-0.5" had to do with centering? It is no like the maximum value is 1.0! If you want to center the data in the y direction (and allow negative values) then, you need to subtract half the maximum value from all data points, not just "half" (0.5). You maximum value is over 200...I don't see how taking 0.5 away from it centers it. In the same line, you divide by 10...are you scaling everything down? and then multiply by 2...scaling again?

In the following loop you take pron and multiply by 7000! what is that about? Then, you take these function values and mix them up with the index variable n, in effect, you are kind of transposing the whole thing using function values as your index...is this on purpose? I am confused.
I am sorry for not being clear enough. In the program I posted, m(f) are random numbers with gaussian distribution. In reality I tried to obtain random numbers with a gaussian distribution centred at the origin in two ways, but I always run into the same problem. My first attempt was to compute random numbers with uniform distribution between -0.1 and 0.1 and then to use them in order to create random numbers k(f) with a gaussian distribution centred at the origin. To compute them I used the following code
Fortran:
do i=1,n

       call random_number(x(i))

end do

k(f)=sum(x)/n

where the values of x(i) are random numbers between 0 and 1. The problem is that I would like to have a gaussian distribution centred at 0 and not at 1/2, which should be the centre of the gaussian distribution generated above because it is the distribution of the mean values of random numbers with expectation value 1/2, if I'm not wrong. Moreover, even though the fortran function random_number(x) generates random numbers between 0 and 1, I would like to have them between -0.1 and 0.1. So I needed to multiply the random numbers generated by random_number(x) by 2*0.1, which is the maximum width, and after that I needed to subtract half of the maximum value, which is 0.1, so that my random numbers should be of the form randomnumber=2*0.1*(x-0.5) where x is generated with random_number(x). But in doing so I ran into the problem I described above, because if I only plotted my gaussian distribution obtained with the code above I had no problem, but if I plotted my k(f) calculated with the following code
Fortran:
do i=1,n

       call random_number(x(i)) 
       randomnumber=2*0.1*(x(i)-0.5)

end do

k(f)=sum(randomnumber)/n

i found that weird crazy point. So I tried a second attempt, which is the one I quoted in my first post. This time I computed my k(f) with the first code, not including the line randomnumber=2*0.1*(x(i)-0.5), and then I tried to shift these numbers with gaussian distribution centered at 1/2 in order to obtain random numbers with a gaussian distribution centred at the origin. I thought that the same line could be used in order to shift the data because I still wanted a maximum width of 2*0.1 and still wanted the centre to be 0, but I am not sure of this. In the program above, the m(f) correspond to k(f). However I run again into exactly the same problem.
The second part of the program is used to make the gaussian distribution. I defined "l" to be the width of a single interval, that is I wrote l=0.1/h where h is the number of intervals I want between 0 and 0.1. After that I computed p=int(pron(i)/l). In this way I should be able to compute the integer part of pron(i)/l where pron(i) are my shifted random numbers of the second attempt. Doing so I know that the pth interval contains one of my random numbers, in this case pron(i) (rigorously speaking it should be the (p+1)th but I don't think it should change much), and to keep track of this I add 1 to the (p+n/2+1)th component of the array "a" that I defined to be a=0 before the do cycle. I added n/2+1 to the index of "a" so that, when p=0 I obtain that the random number is assigned to the (n/2+1)th component. Finally, to do the plot of the gaussian distributions, I defined an index ltot(j) to be ltot(j)=-(n/2.0)+j so that, when I plot the points (a(i), ltot(i)), I should obtain a gaussian distribution centered at the origin. That was what I thought, but obviously there is a problem somewhere. I hope to have been more clear!
 
  • #4
maybe you should google a bit more...you are not the first one trying to do this.

B the way, the manner to get random numbers within a range is: min + (max - min)*random_number(x) and not that way you are doing it...it is important that the multiplication (scaling) be done while all numbers are still positive as the scaling reduces the magnitude of the number towards zero.

Here is the page of some guy who is nice enough to share his experience; once in that page, search for the word "normal" (which is another name for gaussian distribution).
 
  • #5
gsal said:
B the way, the manner to get random numbers within a range is: min + (max - min)*random_number(x) and not that way you are doing it...it is important that the multiplication (scaling) be done while all numbers are still positive as the scaling reduces the magnitude of the number towards zero.

Here is the page of some guy who is nice enough to share his experience; once in that page, search for the word "normal" (which is another name for gaussian distribution).
I so strongly disagree. The page to which you referenced did things exactly wrong. (Prima facie evidence: That guy uses the original Numerical Recipes ran1. That's been known to be bad since sometime in the previous millennium. There is no reason to use something that isn't at least as good as Mersenne Twister nowadays.) In general, it's much better to use the following to convert a number randomly drawn from U(0,1) to a number randomly drawn from U(min,max) than your min+(max-min)*u_0_1_random_deviate:
Code:
r = u_0_1_random_deviate
u_min_max_random_deviate = max*r + min*(1.0-r)
 
  • #6
D H:
Well, I am no random number expert and was simply providing a link since the OP could use some examples. I don't exactly know what you mean by Numerical Recipes ran1...the ran1 function in the referenced page does nothing but call Fortran90 instrinsic function random_number().

By the way, the min+(max-min)*u_0_1_random formula is not my formula...in any case, it is exactly the same as yours.

Other than that, forgive me if I don't share your passion, but please keep cool and don't kill the messenger, I am just trying to help.
 
  • #7
gsal said:
By the way, the min+(max-min)*u_0_1_random formula is not my formula...in any case, it is exactly the same as yours.
If you had infinite precision, yes, they would be exactly the same. You don't have infinite precision. Your example uses double precision.

Here's a key problem with using min+(max-min)*u_0_1_random instead of my recommended approach, max*u_0_1_random+min*(1.0d0-u_0_1_random). Suppose u_0_1_random is exactly one. Your version reduces to min+(max-min). On the other hand, my version reduces to max in this case.

There is no guarantee that min+(max-min) is equal to max. There are in fact plenty of pairs of IEEE floating point numbers min and max such that min+(max-min) > max.
 
  • #8
I see, thanks for the explanation.
 
  • #9
Aner said:
To compute them I used the following code
Fortran:
do i=1,n

       call random_number(x(i))

end do

k(f)=sum(x)/n
Your k(f) does not have a gaussian distribution. It has a Bates distribution, which is close to normal thanks to the central limit theorem and thanks to special properties of the uniform distribution. (It takes a lot more trials for the sum of independent exponentially distributed random variables converge toward a normal distribution than it does for the sum of independent uniformly distributed random variables.)

You can use this as an approximation of a guassian distribution, but you need to account for two things:
- The mean is not zero (it is 1/2)
- The variance is not one (it is 1/(12*n)).

You would have something close to [itex]N(0,1)[/itex] (the standard normal distribution, with mean 0 and variances 1) if you had used k(f)=sqrt(12.0d0*n)*(sum(x)/n-0.5). Converting from [itex]N(0,1)[/itex] to [itex]N(\mu,\sigma)[/itex] is fairly straightforward: Multiply by sigma and add mu.
 

FAQ: [Fortran90]Problem with Gaussian distribution

What is the Gaussian distribution?

The Gaussian distribution, also known as the normal distribution, is a probability distribution that is commonly used to describe continuous data. It is characterized by a symmetrical bell-shaped curve, with the peak of the curve representing the mean of the data and the width of the curve representing the standard deviation.

What is the problem with the Gaussian distribution in Fortran90?

The main problem with the Gaussian distribution in Fortran90 is that it does not have a built-in function for generating random numbers that follow a Gaussian distribution. This can make it difficult to use the distribution in simulations or other applications.

How can I generate random numbers from a Gaussian distribution in Fortran90?

While Fortran90 does not have a built-in function for generating random numbers from a Gaussian distribution, there are several ways to do so. One method is to use the Box-Muller transform, which involves using two uniform random numbers to generate a pair of random numbers that follow a Gaussian distribution. Another option is to use a library or module that has already implemented a function for generating random Gaussian numbers.

Can I customize the parameters of the Gaussian distribution in Fortran90?

Yes, it is possible to customize the parameters of the Gaussian distribution in Fortran90. The mean and standard deviation can be adjusted to fit the specific needs of your application. Additionally, some libraries or modules may offer additional parameters or options for customizing the distribution.

Are there any alternatives to using the Gaussian distribution in Fortran90?

Yes, there are several alternatives to using the Gaussian distribution in Fortran90. Some other commonly used distributions include the uniform distribution, exponential distribution, and Poisson distribution. The choice of distribution will depend on the specific needs and characteristics of your data. It is important to research and understand the properties of different distributions before deciding on the most appropriate one for your application.

Similar threads

Replies
5
Views
4K
Replies
4
Views
2K
Replies
4
Views
2K
Replies
8
Views
1K
Replies
12
Views
1K
Replies
5
Views
1K
Replies
17
Views
5K
Back
Top