davidfur
				
				
			 
			
	
	
	
		
	
	
			
		
		
			
			
				
- TL;DR Summary
- I'm trying to parallelize a simple DO loop for the first time, but without success. I need some basic help with a minimal code sample.
Hey guys,
I've started reading about OpenMP programming and now I'm trying to parallelize a small part of a Fortran code.
The first thing I would like to do is to parallelize the innermost DO loop. It loops over all particles (na is the number of particles) and calculates
the distance between some point in 3D space (pos) and each particle's position. At the end, the particle closest to the point should be identified.
	
	
	
    
	
		
		Fortran:
	
	!$omp parallel do private(i1,imol2,atomDis) default(shared)
            do i1=1,na
              imol1=iag(aid,3+mbond)
              imol2=iag(i1,3+mbond)
              !write(*,*) 'atom ',aid,' belongs to mol: ',imol1
              !write(*,*) 'atom ',i1,'  belongs to mol: ',imol2
              ! perform analysis only on same molecule
              if (imol1 .NE. imol2) then
                  !write(*,*) 'cycle at atom ',i1
                 cycle
              endif
              call dista3(i1,pos,atomDis,dx,dy,dz)
              ! pos is already in angstrom.
              ! convert atomDis back to bohr.
              atomDis=atomDis/bohr2ang
              !write(*,*) 'atomDis=',atomDis
              if (atomDis < shortestDis) then
                 !write(*,*) 'closest atom is: ',i1
                 closestAtm = i1
                 shortestDis = atomDis
              endif
            enddo
!$omp end parallel do

When I compile the program and run it on 1 thread, the execution time is 13 seconds (for the whole program). Running on 10 threads, the execution time jumps to over a minute. Clearly, I have messed up somewhere, but my understanding is still lacking...
Specifically, I would like to see the expected speed-up, and also make sure that all threads are aware of the most up-to-date shortestDis to compare to.
Can anybody guide me through this?
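
For reference, here is a minimal, self-contained sketch of the same closest-particle search. It shows one common way to avoid having every thread read and write shortestDis and closestAtm at once: each thread keeps its own running minimum and the per-thread results are merged once at the end. It is only a sketch under stated assumptions, not the original routine. The array coords, the names localDis and localAtm, the particle count, and the inline distance formula are all hypothetical stand-ins for iag, dista3 and bohr2ang, which are not shown here. Whether dista3 is safe to call from several threads (it also writes dx, dy, dz) is not known from the snippet above.

```fortran
program closest_particle
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: na = 100000        ! hypothetical particle count
  real(dp), allocatable :: coords(:,:)     ! hypothetical particle positions, shape (3,na)
  real(dp) :: pos(3)                       ! reference point in 3D space
  real(dp) :: d, localDis, shortestDis
  integer  :: i1, localAtm, closestAtm

  allocate(coords(3,na))
  call random_number(coords)               ! made-up data, for illustration only
  pos = [0.5_dp, 0.5_dp, 0.5_dp]

  shortestDis = huge(1.0_dp)
  closestAtm  = 0

  !$omp parallel default(none) shared(coords,pos,shortestDis,closestAtm) &
  !$omp& private(i1,d,localDis,localAtm)

  ! Each thread starts with its own private running minimum.
  localDis = huge(1.0_dp)
  localAtm = 0

  !$omp do
  do i1 = 1, na
     ! Inline Euclidean distance; stands in for the call to dista3.
     d = sqrt(sum((coords(:,i1) - pos)**2))
     if (d < localDis) then
        localDis = d
        localAtm = i1
     end if
  end do
  !$omp end do nowait

  ! Merge the per-thread minima; only one thread at a time enters here.
  !$omp critical
  if (localDis < shortestDis) then
     shortestDis = localDis
     closestAtm  = localAtm
  end if
  !$omp end critical

  !$omp end parallel

  print *, 'closest atom: ', closestAtm, ' distance: ', shortestDis
end program closest_particle
```

With gfortran this can be built with the -fopenmp flag; without it the directives are treated as comments and the program runs serially. Note that if two particles are at exactly the same minimum distance, the merged result may name a different one than the serial loop would, because the order in which threads enter the critical section is not fixed.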
 
 
		 
 
		