Finding the best MPI parameters for MCMC (Markov-Chain Monte Carlo)

In summary, for this particular setup, the `mpirun --use-hwthread-cpus cobaya-run test.yaml` command should be enough to use all four hardware threads (two physical cores with two threads each) for the MCMC task in WSL2/Ubuntu.
  • #1
Arman777
I am using WSL2/Ubuntu 20.04 on Windows 10 to run a specific cosmological parameter estimation program called Cobaya. (It only runs on Linux and Mac)

Here are my specs.

Code:
Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   39 bits physical, 48 bits virtual
CPU(s):                          4
On-line CPU(s) list:             0-3
Thread(s) per core:              2
Core(s) per socket:              2
Socket(s):                       1
Vendor ID:                       GenuineIntel
CPU family:                      6
Model:                           142
Model name:                      Intel(R) Core(TM) i5-7200U CPU @ 2.50GHz
Stepping:                        9
CPU MHz:                         2711.999
BogoMIPS:                        5423.99
Hypervisor vendor:               Microsoft
Virtualization type:             full
Cobaya uses the MCMC method for parameter estimation. What would be the best `mpirun ...` command to fully use my computer's specs for the MCMC task in WSL2/Ubuntu?

I have searched a bit, and choosing the right `mpirun` arguments can affect the run time. Since my computer is nearly garbage for data analysis, I need to pick the right ones.

Here is the Open MPI `mpirun` man page: https://www.open-mpi.org/doc/v3.0/man1/mpirun.1.php

Normally I use `mpirun -n 2 cobaya-run test.yaml`. Even though it seems I have 4 cores, I can only use 2. I guess that to use all 4, I can run `mpirun --use-hwthread-cpus cobaya-run test.yaml`.

Should I change anything else, or will `mpirun --use-hwthread-cpus cobaya-run test.yaml` alone be enough?
 
  • #2
In general, it is difficult to give a single answer, because the optimal MPI command for parameter estimation depends on several factors. In this specific case, `mpirun --use-hwthread-cpus cobaya-run test.yaml` should be sufficient to use all 4 logical CPUs. Note that the `lscpu` output shows 2 physical cores with 2 hardware threads each, not 4 physical cores, which is why the plain `mpirun -n 2 ...` form stops at 2 processes. The `mpirun` command also accepts other arguments, such as the number of processes and how they are mapped and bound to cores or threads, but the best values depend on the application. It is therefore best to consult the application's documentation to decide which arguments suit the task at hand.
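To make the core/thread distinction concrete, here is a small sketch that reads the relevant `lscpu` fields and prints the two `mpirun` invocations mentioned in this thread. The function names `cpu_layout` and `candidate_commands` are made up for illustration, and the script only builds command strings; it does not run MPI or Cobaya. Which of the two commands is actually faster is something to time on the machine itself, since hardware threads do not always speed up numerical work.

```python
# Sample fields copied from the lscpu output in post #1.
SAMPLE_LSCPU = """\
CPU(s):                          4
Thread(s) per core:              2
Core(s) per socket:              2
Socket(s):                       1
"""


def cpu_layout(lscpu_text):
    """Return physical core and hardware thread counts from lscpu text."""
    fields = {}
    for line in lscpu_text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    cores = int(fields["Socket(s)"]) * int(fields["Core(s) per socket"])
    threads = cores * int(fields["Thread(s) per core"])
    return {"physical_cores": cores, "hardware_threads": threads}


def candidate_commands(layout, yaml_file="test.yaml"):
    """Build the two command lines discussed in the thread (strings only)."""
    return {
        "one process per physical core":
            f"mpirun -n {layout['physical_cores']} cobaya-run {yaml_file}",
        "one process per hardware thread":
            f"mpirun --use-hwthread-cpus cobaya-run {yaml_file}",
    }


if __name__ == "__main__":
    layout = cpu_layout(SAMPLE_LSCPU)
    print(layout)
    for label, command in candidate_commands(layout).items():
        print(f"{label}: {command}")
```

Timing a short run with each command (for example with `time`) is the simplest way to decide between them on a given machine.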
 

FAQ: Finding the best MPI parameters for MCMC (Markov-Chain Monte Carlo)

What is MCMC and why is it important?

MCMC stands for Markov-Chain Monte Carlo and it is a computational method used to generate samples from a probability distribution. It is important because it allows for efficient exploration of high-dimensional spaces and is commonly used in Bayesian statistics and machine learning.
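As a rough illustration of the idea, here is a minimal random-walk Metropolis sampler for a one-dimensional standard normal target. This is a textbook sketch, not Cobaya's sampler; the names `log_target` and `metropolis`, the target, and the default step size are all assumptions chosen for the example.

```python
import math
import random


def log_target(x):
    """Log-density of a standard normal, up to an additive constant."""
    return -0.5 * x * x


def metropolis(n_steps, step_size=1.0, x0=0.0, seed=0):
    """Random-walk Metropolis: propose x + noise, accept with prob min(1, ratio)."""
    rng = random.Random(seed)
    x = x0
    logp = log_target(x)
    chain = []
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step_size)
        logp_new = log_target(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        if rng.random() < math.exp(min(0.0, logp_new - logp)):
            x, logp = proposal, logp_new
        chain.append(x)  # a rejected move repeats the current point
    return chain


if __name__ == "__main__":
    samples = metropolis(20000, seed=1)
    mean = sum(samples) / len(samples)
    var = sum((s - mean) ** 2 for s in samples) / len(samples)
    print(f"mean={mean:.3f} variance={var:.3f}")
```

For a long enough chain the sample mean and variance should approach 0 and 1, the values of the target distribution.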

What are the key parameters to consider when using MCMC?

The key parameters to consider when using MCMC are the number of iterations, the burn-in period, the proposal distribution, the acceptance rate, and the convergence criteria. These parameters can greatly affect the performance and accuracy of the MCMC algorithm.
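To see how the proposal distribution and the acceptance rate interact, the sketch below measures the acceptance rate of a random-walk Metropolis chain on a standard normal target for a few proposal widths. The function name `acceptance_rate`, the target, and the widths are assumptions for illustration only.

```python
import math
import random


def acceptance_rate(step_size, n_steps=5000, seed=0):
    """Fraction of accepted proposals for a standard normal target."""
    rng = random.Random(seed)
    x = 0.0
    accepted = 0
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step_size)
        # log of p(proposal) / p(x) for p(x) proportional to exp(-x^2 / 2)
        log_ratio = 0.5 * (x * x - proposal * proposal)
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x = proposal
            accepted += 1
    return accepted / n_steps


if __name__ == "__main__":
    for width in (0.1, 2.4, 20.0):
        print(f"proposal width {width}: acceptance {acceptance_rate(width):.2f}")
```

Very narrow proposals are accepted almost always but move slowly, while very wide ones are mostly rejected; a useful width sits in between.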

How do I determine the best number of iterations for MCMC?

The number of iterations depends on the complexity of the problem and the desired accuracy. There is no universal minimum; a common starting point is on the order of 10,000 iterations, followed by a convergence check. If the chain has not converged or the desired accuracy is not reached, increase the number of iterations.

What is the burn-in period and how do I determine it?

The burn-in period is the number of initial iterations that are discarded to allow the MCMC algorithm to reach the stationary distribution. It is important to choose a burn-in period that is large enough to ensure convergence, but not so large that it wastes computational time. A common rule of thumb is to discard the first 25% of iterations as burn-in.
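The 25% rule of thumb amounts to a simple slice. A minimal sketch, with the helper name `discard_burn_in` invented for this example:

```python
def discard_burn_in(chain, fraction=0.25):
    """Drop the first `fraction` of the samples and return the rest."""
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    n_discard = int(len(chain) * fraction)
    return chain[n_discard:]


if __name__ == "__main__":
    chain = list(range(100))  # stand-in for 100 MCMC samples
    kept = discard_burn_in(chain)
    print(len(kept), kept[0])
```

The fraction is only a default; a trace plot of the chain is the usual way to check whether more or less should be discarded.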

How do I evaluate the performance of MCMC and choose the best parameters?

The performance of MCMC can be evaluated by examining trace plots, autocorrelation plots, and convergence diagnostics. Choose settings that give low autocorrelation and fast convergence to the stationary distribution. It is also recommended to run multiple MCMC chains from different starting points and compare their results to check that they agree.
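Two of these diagnostics are easy to sketch: the lag-k autocorrelation of a single chain and the Gelman-Rubin statistic (often written R-hat) across several chains of equal length. The code below uses the textbook formulas; the function names are invented for this example, and how a particular sampler defines and applies its own convergence criterion should be checked in that sampler's documentation.

```python
import math
import random
import statistics


def autocorrelation(chain, lag):
    """Sample autocorrelation of one chain at the given lag."""
    mean = statistics.fmean(chain)
    denom = sum((x - mean) ** 2 for x in chain)
    num = sum((chain[t] - mean) * (chain[t + lag] - mean)
              for t in range(len(chain) - lag))
    return num / denom


def gelman_rubin(chains):
    """Textbook R-hat for chains of equal length; values near 1 suggest agreement."""
    n = len(chains[0])
    means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])
    between = n * statistics.variance(means)
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)


if __name__ == "__main__":
    rng = random.Random(0)
    # Independent draws stand in for well-mixed chains.
    chains = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(4)]
    print(f"R-hat, agreeing chains: {gelman_rubin(chains):.3f}")
    shifted = [chains[0], [x + 5.0 for x in chains[1]]]
    print(f"R-hat, disagreeing chains: {gelman_rubin(shifted):.3f}")
    print(f"lag-1 autocorrelation: {autocorrelation(chains[0], 1):.3f}")
```

Chains that sample the same distribution give a value close to 1, while chains stuck in different regions give a clearly larger value.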
