Random Seed Choice for LAMMPS Molecular Dynamics Simulations

In summary: I believe that an early random number generator in the IBM library had a serious problem. Yes, that's what my friend told me. He worked on it in the early 1970s.
  • #1
person123
TL;DR Summary
How does the random seed influence the random value?
I want to create multiple molecular dynamics simulations using LAMMPS which are different only in the initial velocity of the atoms. LAMMPS allows you to use a random seed to generate an initial velocity. I plan to just use successive numbers, so if there are 5 simulations, the seeds would be 1,2,3,4, and 5.

I just want to make sure that the random values a seed produces aren't influenced by how large the seed is. For example, would the results from a random seed of 2 be any more similar to those from 3 than to those from, say, 548? Would choosing successive integers for the random seed be an acceptable way to generate random values?
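As a rough sketch of the plan in this post, the snippet below builds one LAMMPS `velocity ... create` line per run using seeds 1 through 5. The helper name `velocity_commands`, the group `all`, and the temperature `300.0` are placeholders chosen for illustration, not values from the thread.

```python
def velocity_commands(seeds, temperature=300.0, group="all"):
    """Return one 'velocity <group> create <T> <seed>' line per seed.

    The temperature and group defaults are illustrative placeholders.
    """
    return [f"velocity {group} create {temperature} {seed}" for seed in seeds]


# Successive integer seeds, one per simulation.
commands = velocity_commands(range(1, 6))
for line in commands:
    print(line)
```

Each printed line would go into the input script of one run; everything else in the scripts stays identical.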
 
  • #2
There should be no problem with the method you suggest unless the documentation specifically tells you to avoid certain values. The fact that the seeds are close together should not cause the random results to be related. That is one important property of a good random number generator.
 
  • #3
Got it. Thank you! Yes, there's nothing in the documentation about avoiding certain numbers.
 
  • #4
One thing about choosing seeds is that using the same value will give you the same sequence of values. This is useful when trying to debug a problem. However, it may not be what you want when you demo it to someone. Often, programmers will seed with the current time value.

You could generate a CSV file with each column representing a different seed and each row a new call, then analyze the data in Excel to see how the columns vary.
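A minimal sketch of that CSV idea follows. It uses Python's built-in `random.Random` as a stand-in generator, which is an assumption: it is not the generator LAMMPS uses, so this only illustrates the layout (one column per seed, one row per call). The function name `seed_table` is hypothetical.

```python
import csv
import io
import random


def seed_table(seeds, n_rows):
    """Rows of draws; column j comes from a generator seeded with seeds[j]."""
    generators = [random.Random(s) for s in seeds]
    return [[g.random() for g in generators] for _ in range(n_rows)]


seeds = [1, 2, 3, 4, 5]
rows = seed_table(seeds, n_rows=10)

# Written to an in-memory buffer here; replace it with
# open("seeds.csv", "w", newline="") to produce a file for a spreadsheet.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow([f"seed_{s}" for s in seeds])
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

Plotting or correlating the columns in a spreadsheet then gives a quick visual check on whether neighbouring seeds produce related streams.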

Not all random number generators are created equal so it would be good to search for any problems associated with your particular generator to be safe.

Years ago, a programmer who specialized in random number generators told me of one such generator that on the surface appeared random, but when data points were grouped into triplets and plotted, each was guaranteed to be a point on a plane in 3-space, i.e., given a few points in the sequence, you could predict the next one.

https://en.wikipedia.org/wiki/Random_number_generation
 
  • #5
I would pick large odd numbers. This is a little bit "lore", and certainly doesn't apply to many RNGs. For example, if x is your sequence number 10x + 1 + 1000000 might be a better way to do it.
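For concreteness, here is a small sketch of that mapping, with x as the run index. The function name `spaced_seed` and its parameter names are made up for illustration; only the formula 10x + 1 + 1000000 comes from the post.

```python
def spaced_seed(run_index, stride=10, offset=1_000_001):
    """Map a small run index x to the large odd seed 10*x + 1 + 1000000."""
    return stride * run_index + offset


seeds = [spaced_seed(x) for x in range(1, 6)]
print(seeds)  # [1000011, 1000021, 1000031, 1000041, 1000051]
```

Because the stride is even and the offset is odd, every seed produced this way is odd.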
 
  • #6
Vanadium 50 said:
I would pick large odd numbers. This is a little bit "lore", and certainly doesn't apply to many RNGs. For example, if x is your sequence number 10x + 1 + 1000000 might be a better way to do it.
That is what I always thought from old texts (Knuth?), but I haven't seen that mentioned for a long time.
 
  • #7
jedishrfu said:
Years ago, a programmer who specialized in random number generators told me of one such generator that on the surface appeared random, but when data points were grouped into triplets and plotted, each was guaranteed to be a point on a plane in 3-space, i.e., given a few points in the sequence, you could predict the next one.

https://en.wikipedia.org/wiki/Random_number_generation
I believe that an early random number generator in the IBM library had a serious problem. Even later versions would show some surprising patterns if advanced time series analysis is used. It is my conclusion that the current random number generators are satisfactory for the vast majority of uses, but if you apply very sophisticated analysis, there may be issues. On the other hand, I do not think that most computer models are so accurate that the random number generator would be the weakest link.
 
  • #8
FactChecker said:
I believe that an early random number generator in the IBM library had a serious problem.

You may be remembering (or misremembering) RANDU.

[Attached image: 1611860823832.png]

Anyway, "large and odd" may well be unnecessary, but it sure doesn't hurt anything.
 
  • #9
Vanadium 50 said:
You may be remembering (or misremembering) RANDU.

View attachment 276982

Anyway, "large and odd" may well be unnecessary, but it sure doesn't hurt anything.
Yes. That was it. I didn't want to name it because I am not sure that modern versions with the same name still have that problem.
 
  • #10
Nice, that's probably what my friend was referring to when he told me this story in 1975-76 before he retired from GE.

He was the resident math library programmer for the GE computing center, and he did his work in Fortran IV and Honeywell Fortran-Y, a precursor to Fortran-77.

Although I was under the impression it was a single plane, multiple planes make more sense.

https://en.wikipedia.org/wiki/RANDU
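The plane structure can be checked numerically. The sketch below implements the RANDU recurrence x_{n+1} = 65539 · x_n mod 2^31 and verifies that every consecutive triple satisfies x_{n+2} = 6·x_{n+1} − 9·x_n (mod 2^31). That linear relation among three successive outputs is what confines the triplets to a small family of parallel planes. The function and variable names are my own.

```python
MODULUS = 2**31
MULTIPLIER = 65539  # 2**16 + 3


def randu(seed, count):
    """Return `count` successive RANDU integers, starting from an odd seed."""
    values = []
    x = seed
    for _ in range(count):
        x = (MULTIPLIER * x) % MODULUS
        values.append(x)
    return values


xs = randu(seed=1, count=1000)

# Residual of x[k+2] - 6*x[k+1] + 9*x[k] modulo 2**31 for every triple.
residuals = [
    (xs[k + 2] - 6 * xs[k + 1] + 9 * xs[k]) % MODULUS
    for k in range(len(xs) - 2)
]
print(all(r == 0 for r in residuals))  # True
```

The identity follows from 65539² = 2^32 + 6·2^16 + 9, which reduces to 6·65539 − 9 modulo 2^31, so the third value of any triple is fully determined by the first two.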
 
  • #11
RANDU sounds like a villain on Star Trek.
 
  • #12
Even the old RANDU was perfectly sufficient for the vast majority of real-world Monte Carlo simulation applications. Its flaws were unlikely to be significant in anything except academic exercises. In general, there are much more serious issues to worry about.
 
  • #14
FactChecker said:
Its flaws were unlikely to be significant in anything except academic exercises.

I disagree. It's fine in 1D and 2D, but if you are "randomly" picking points or directions in 3D space, you don't have complete coverage. Of course, this is only a problem if you live in a universe with three spatial dimensions.
 
  • #15
Vanadium 50 said:
I disagree. It's fine in 1D and 2D, but if you are "randomly" picking points or directions in 3D space, you don't have complete coverage. Of course, this is only a problem if you live in a universe with three spatial dimensions.
"Complete coverage" is rarely achieved in a Monte Carlo simulation. The nature of the imperfections that you are pointing out is no greater than many Monte Carlo simulations with a limited number of samples. I have personally never been suspicious of the random number generator results except in academic examples where time series analysis was applied. I admit that there are disciplines where it is critical, but they are far fewer than the "universe with three spatial dimensions". In fact, many Monte Carlo simulations have hundreds, even thousands, of dimensions, and approaching full coverage is completely impractical.
 
  • #16
Vanadium 50 said:
I would pick large odd numbers. This is a little bit "lore", and certainly doesn't apply to many RNGs. For example, if x is your sequence number 10x + 1 + 1000000 might be a better way to do it.
I think for my application, I don't need to worry too much about these issues with patterns in the random numbers as some of you suggested, but I don't see any reason not to be safe and use an approach like this instead.
 
  • #18
If you want a random distribution that uniformly populates x, y and z, I would say that this

[Attached image: 1611877002456.png]

is not it.
 
  • #19
jedishrfu said:
Sometimes picking prime numbers is a safer bet since you're doing a lot of modulo math when generating the random number.
True and accurate. I recall evaluating commercial, government (NASA), and academic random number generators back in the 1990s. Selecting twin prime numbers as seeds in subsequent runs produced interesting results in certain cases, though I disremember the correlations.

Selecting a robust prime as seed should provide sufficient independence among software runs.

[Edit: Twin primes are prime numbers separated by 2. Prime numbers are natural numbers >1 whose only factors are 1 and themselves.]
 
  • #20
Vanadium 50 said:
If you want a random distribution that uniformly populates x, y and z, I would say that this

View attachment 277002

is not it.
That is a good point. I doubt that any of the standard library random number generators use that any more.
 
  • #21
From folks at Sandia lab (who maintain the code):
LAMMPS uses its own random number generator which has the following properties:
If you use 1,2,3,4,5 over and over, you get the same sets of velocities over and over.
For modeling without the "over and over" effect they suggest a cut and paste from a www site like
http://www.random.org/integers
which claims true randomness.

https://lammps.sandia.gov/threads/msg10852.html

So I would suspect that the RNG in the program is a PRNG which means it is cyclic. That is: After generating a very long sequence of numbers, it comes back to where it started and repeats that same sequence. And it also generates the exact same sequence each time you feed it a fixed number like 5.

Just do what they suggest.
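The "same seed, same sequence" behaviour is easy to see with any seeded PRNG. The sketch below uses Python's `random.Random` purely as a stand-in; it is an assumption that this mirrors LAMMPS qualitatively, since LAMMPS has its own generator. The helper name `draws` is hypothetical.

```python
import random


def draws(seed, count=5):
    """Return the first `count` values from a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(count)]


first = draws(5)
second = draws(5)   # same seed again
other = draws(6)    # neighbouring seed

print(first == second)  # True: a fixed seed reproduces the sequence exactly
print(first == other)   # False: a different seed gives a different sequence
```

This is why reusing 1, 2, 3, 4, 5 in a later batch of runs would reproduce the earlier velocities rather than give new samples.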
 
  • #22
Got it. I will use that approach of using random numbers as the seed.
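If pasting from a web page is inconvenient, one possible alternative (my own substitution, not what the Sandia post recommends) is to draw the seeds locally from the operating system's entropy source and save them so the runs can be repeated later. The helper name `fresh_seeds` and the upper bound of 900,000,000 are assumptions; check the LAMMPS documentation for the seed range each command accepts.

```python
import json
import secrets


def fresh_seeds(n_runs, upper=900_000_000):
    """Draw n_runs distinct seeds in 1..upper from the OS entropy source.

    `upper` is a conservative placeholder, not a documented LAMMPS limit.
    """
    seeds = set()
    while len(seeds) < n_runs:
        seeds.add(secrets.randbelow(upper) + 1)  # randbelow gives 0..upper-1
    return sorted(seeds)


seeds = fresh_seeds(5)

# Record the seeds so each run can be reproduced exactly later on.
record = json.dumps({"seeds": seeds})
print(record)
```

Keeping the recorded seeds alongside the simulation output preserves reproducibility while still avoiding the "over and over" effect between batches.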
 

FAQ: Random Seed Choice for LAMMPS Molecular Dynamics Simulations

What is a random seed in LAMMPS molecular dynamics simulations?

A random seed is a starting point or initial value used in LAMMPS simulations to generate pseudorandom numbers. These numbers are used to introduce randomness into the simulation and can affect the outcome of the simulation.

Why is it important to choose a random seed in LAMMPS simulations?

Choosing a random seed is important because it helps to ensure that the simulation results are not biased or dependent on a specific starting point. This allows for more accurate and reliable simulations.

How is a random seed chosen in LAMMPS simulations?

A random seed in LAMMPS simulations is normally supplied by the user: commands such as velocity take the seed as an explicit positive-integer argument in the input script. If you want the value to change automatically from run to run, generate it outside LAMMPS (for example from the system clock or a random-number service) and substitute it into the input.

Can the same random seed be used for different LAMMPS simulations?

Yes, the same random seed can be used for different LAMMPS simulations. However, this may result in similar or identical simulation outcomes, which may not accurately represent the system being studied. It is generally recommended to use a different seed for each simulation.

Are there any best practices for choosing a random seed in LAMMPS simulations?

Yes, there are some best practices for choosing a random seed in LAMMPS simulations. These include using a different seed for each simulation, avoiding using predefined seeds, and ensuring the seed is truly random to avoid bias in the simulation results. Additionally, it is important to document the chosen seed for reproducibility purposes.
