FORTRAN ?: Trying to bypass significant slowdown

  • Thread starter blbelson
In summary: The poster's four-level nested loop ran far slower once the running total "tot=tot+ele" was included, most likely because the compiler skipped the otherwise unused calculations when that line was commented out. Suggestions in the thread included unrolling the innermost loop, moving calculations from the inner loop to the outer ones, trying the SUM array intrinsic, and integrating to 360-step rather than 360. The poster reports that there is no way around four integrations, because they run over incident and reflected angles rather than spheres.
  • #1
blbelson
I have experienced a serious slowdown in program execution and traced its source to one calculation. Pseudocode below:

do a = 0, 360, 10
  do b = 0, 85, 5
    do c = 0, 360, 10
      do d = 0, 85, 5
        ! bunch of calculations involving double precision real variables (stored in variable ele)
        tot = tot + ele
      end do
    end do
  end do
end do

tot and ele are both double precision real variables. If I comment out "tot=tot+ele", the program takes 2m 04s to run, otherwise it takes 5m 55s to run.

I am using the ifort compiler with "-ipo -O3 -static -xP -no-prec-div -inline" flags set. Does anyone have an explanation as to why this occurs (compiler optimization issue for example) and if there is a way to prevent it?
 
  • #2
Each statement in your innermost loop runs 37*18*37*18 times, or 443,556 times: a loop from 0 to 360 in steps of 10 makes 37 trips, and one from 0 to 85 in steps of 5 makes 18. If you can eliminate one or more of your loops, that might speed things up. For example, you can "unroll" the innermost loop by having a separate statement or block of statements for each of the values of your loop counter variable, d. The first one would be with d = 0, the second one with d = 5, and so on, up to 85.
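The trip count of the nest as posted can be checked directly. The following is only a sketch: the helper function `trips` and the variable names are made up for illustration, and the bounds are copied from the pseudocode in post #1.

```fortran
program count_iterations
  implicit none
  integer :: a, b, c, d
  integer :: n_counted, n_formula

  ! Count trips through the nest exactly as posted in #1.
  n_counted = 0
  do a = 0, 360, 10
    do b = 0, 85, 5
      do c = 0, 360, 10
        do d = 0, 85, 5
          n_counted = n_counted + 1
        end do
      end do
    end do
  end do

  ! A DO loop "first, last, step" makes (last - first + step) / step trips
  ! (integer division) when that value is positive.
  n_formula = trips(0, 360, 10) * trips(0, 85, 5) &
            * trips(0, 360, 10) * trips(0, 85, 5)

  print *, 'counted:', n_counted, '  formula:', n_formula
  if (n_counted /= n_formula) error stop 'trip counts disagree'

contains

  ! Hypothetical helper: number of trips of "do i = first, last, step".
  pure integer function trips(first, last, step)
    integer, intent(in) :: first, last, step
    trips = max(0, (last - first + step) / step)
  end function trips

end program count_iterations
```

Both numbers should come out as 37 * 18 * 37 * 18 = 443,556 for the bounds as posted.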
 
  • #3
Thanks Mark 44, I believe the compiler simply skipped the loop when I commented out the line since the loop served no purpose in that instance. I was afraid I'd have to tackle the problem in the way you suggested above but I took a hybrid approach and wrote a subroutine since there are quite a few calculations in that block (realizing the performance boost may be less than writing each line of code in the main body). Doing so reduced program execution time to 4m 22s. Thanks for the kickstart...
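To check whether the compiler really discarded the loop, one option is to time the nest while making sure its result is used afterwards. This is a minimal sketch, not the original code: the expression assigned to `ele` is a made-up stand-in for the real calculations, and the timing uses the standard `system_clock` intrinsic.

```fortran
program time_accumulation
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp), parameter :: deg2rad = 3.14159265358979323846_dp / 180.0_dp
  integer :: a, b, c, d
  integer :: t_start, t_end, t_rate
  real(dp) :: ele, tot

  tot = 0.0_dp
  call system_clock(t_start, t_rate)
  do a = 0, 360, 10            ! bounds as in post #1
    do b = 0, 85, 5
      do c = 0, 360, 10
        do d = 0, 85, 5
          ! Hypothetical stand-in for the real calculations.
          ele = (1.0_dp + cos(a * deg2rad)) * sin(b * deg2rad) &
              * (1.0_dp + cos(c * deg2rad)) * sin(d * deg2rad)
          tot = tot + ele
        end do
      end do
    end do
  end do
  call system_clock(t_end)

  ! Printing tot makes the loop's result observable, so the optimizer
  ! cannot treat the calculations as dead code.
  print *, 'tot     =', tot
  print *, 'seconds =', real(t_end - t_start, dp) / real(t_rate, dp)
end program time_accumulation
```

If the accumulation line is commented out and `ele` is never used afterwards, an optimizing compiler is free to drop the whole nest, which would make the "fast" timing meaningless as a baseline.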
 
  • #4
Calling a subroutine several hundred thousand times might be slower than having the calculations inline in the innermost loop. There is some overhead associated with a subroutine call and return. You might compare the execution times with a subroutine call vs. having the code inline in the inner loop.
 
  • #5
Perhaps you can move some of the calculations from the inner loop to the outward ones, and save intermediate results in temp variables? Although it is most likely already done by the compiler.
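As a rough illustration of moving work outward, assume (purely for the sake of example) that the integrand is a product of one factor per loop variable. Each factor can then be computed once at the loop level where it first becomes known. The integrand and all names below are hypothetical.

```fortran
program hoist_invariants
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp), parameter :: deg2rad = 3.14159265358979323846_dp / 180.0_dp
  integer :: a, b, c, d
  real(dp) :: tot_naive, tot_hoisted
  real(dp) :: fa, fab, fabc

  ! Naive version: every factor is recomputed in the innermost loop.
  tot_naive = 0.0_dp
  do a = 0, 360, 10
    do b = 0, 85, 5
      do c = 0, 360, 10
        do d = 0, 85, 5
          tot_naive = tot_naive + (1.0_dp + cos(a * deg2rad)) * sin(b * deg2rad) &
                                * (1.0_dp + cos(c * deg2rad)) * sin(d * deg2rad)
        end do
      end do
    end do
  end do

  ! Hoisted version: partial products are saved in temporaries as soon
  ! as the loop variables they depend on are fixed.
  tot_hoisted = 0.0_dp
  do a = 0, 360, 10
    fa = 1.0_dp + cos(a * deg2rad)
    do b = 0, 85, 5
      fab = fa * sin(b * deg2rad)
      do c = 0, 360, 10
        fabc = fab * (1.0_dp + cos(c * deg2rad))
        do d = 0, 85, 5
          tot_hoisted = tot_hoisted + fabc * sin(d * deg2rad)
        end do
      end do
    end do
  end do

  print *, 'naive   =', tot_naive
  print *, 'hoisted =', tot_hoisted
  if (abs(tot_naive - tot_hoisted) > 1.0e-9_dp * max(1.0_dp, abs(tot_naive))) then
    error stop 'hoisted result differs from naive result'
  end if
end program hoist_invariants
```

Whether this helps in practice depends on how the real integrand factors, and, as noted above, the compiler may already be doing it.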
 
  • #6
This looks like integration over two spheres. Can you reformulate the problem to eliminate some integrations? For example, if relevant degrees of freedom only depend on the angle between two vectors, you can get away with only one loop instead of 4.

And you shouldn't integrate to 360, you should integrate to 360-step.

Also consider doing this in C.
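The point about stopping at 360-step can be seen with a simple periodic test function. The sketch below assumes a rectangle-rule sum and an arbitrary integrand, 1 + cos(theta), whose exact integral over a full turn is 2*pi; neither comes from the original program.

```fortran
program periodic_endpoint
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp), parameter :: pi = 3.14159265358979323846_dp
  real(dp), parameter :: deg2rad = pi / 180.0_dp
  integer, parameter :: step = 10
  integer :: a
  real(dp) :: h, sum_to_360, sum_to_350

  h = step * deg2rad   ! step width in radians

  ! Upper bound 360: the point at 0 is counted again at 360.
  sum_to_360 = 0.0_dp
  do a = 0, 360, step
    sum_to_360 = sum_to_360 + f(a * deg2rad) * h
  end do

  ! Upper bound 360 - step: each point of the period is counted once.
  sum_to_350 = 0.0_dp
  do a = 0, 360 - step, step
    sum_to_350 = sum_to_350 + f(a * deg2rad) * h
  end do

  print *, 'exact            =', 2.0_dp * pi
  print *, 'sum to 360       =', sum_to_360
  print *, 'sum to 360-step  =', sum_to_350
  if (abs(sum_to_350 - 2.0_dp * pi) > 1.0e-10_dp) error stop 'unexpected result'

contains

  ! Hypothetical periodic test integrand.
  pure function f(theta) result(val)
    real(dp), intent(in) :: theta
    real(dp) :: val
    val = 1.0_dp + cos(theta)
  end function f

end program periodic_endpoint
```

With the upper bound at 360 the sum is too large by one extra sample at the starting angle; with 360-step it matches 2*pi to rounding error for this test function.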
 
  • #7
blbelson said:
If I comment out "tot=tot+ele", the program takes 2m 04s to run, otherwise it takes 5m 55s to run.
Your program probably ran out of registers to use, and perhaps the compiler didn't prioritize inner variables over outer variables. It should help to declare those loop counters as integer, which use a different set of registers.
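On declaring the counters: under Fortran's default implicit typing, undeclared names beginning with a to h are REAL, so counters called a, b, c and d are only integers if they are declared that way. A small sketch, with a made-up variable `theta_deg` for the angle derived from the counter:

```fortran
program integer_counters
  implicit none                 ! forces every variable to be declared
  integer, parameter :: dp = kind(1.0d0)
  integer :: d                  ! loop counter, explicitly integer
  real(dp) :: theta_deg         ! hypothetical angle derived from the counter

  do d = 0, 85, 5
    ! Convert the integer counter to a double precision angle only
    ! where the floating-point value is actually needed.
    theta_deg = real(d, dp)
    print '(a, i3, a, f6.1)', 'd = ', d, '   theta_deg = ', theta_deg
  end do
end program integer_counters
```

With `implicit none` in place, a forgotten declaration becomes a compile-time error instead of a silently real-valued loop counter.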
 
  • #8
> Also consider doing this in C.

Why? C is not intrinsically any faster.
 
  • #9
Right, C is not faster. As a first thought, make sure that your DO loops are in the right "order". You want your innermost loop to run over the first index of your array, e.g.
Code:
DO k=1,kmax
 DO j=1,jmax
  DO i=1,imax
    array(i,j,k) = something
  END DO
 END DO
END DO
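is much faster than having the DO loops in the opposite order, because Fortran stores arrays in column-major order, so the first index is the one that varies fastest in memory.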

Also, as a test (really not sure if it will be faster or not), you can try using the SUM array intrinsic, e.g.
Code:
DO l=1,lmax
 DO k=1,kmax
  DO j=1,jmax
   DO i=1,imax
    elem(i,j,k,l) = whatever   ! elem must be an array for SUM to apply
   END DO
  END DO
 END DO
END DO

tot = SUM(elem)
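A runnable version of the SUM idea might look like the sketch below. It assumes a made-up integrand, stores one element per grid point in a four-dimensional array with the innermost loop on the first index, and uses 360-step as the upper bound for the full-turn angles. The array shape and the names `na` and `nb` are illustrative only.

```fortran
program sum_intrinsic
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp), parameter :: deg2rad = 3.14159265358979323846_dp / 180.0_dp
  integer, parameter :: na = 36   ! angles 0, 10, ..., 350
  integer, parameter :: nb = 18   ! angles 0, 5, ..., 85
  real(dp), allocatable :: elem(:,:,:,:)
  real(dp) :: tot_loop, tot_sum, ele
  integer :: i, j, k, l

  allocate(elem(nb, na, nb, na))

  tot_loop = 0.0_dp
  do l = 1, na                ! plays the role of a
    do k = 1, nb              ! plays the role of b
      do j = 1, na            ! plays the role of c
        do i = 1, nb          ! plays the role of d (first index, innermost)
          ! Hypothetical stand-in for the real calculations.
          ele = (1.0_dp + cos(10 * (l - 1) * deg2rad)) * sin(5 * (k - 1) * deg2rad) &
              * (1.0_dp + cos(10 * (j - 1) * deg2rad)) * sin(5 * (i - 1) * deg2rad)
          elem(i, j, k, l) = ele
          tot_loop = tot_loop + ele   ! running total, for comparison only
        end do
      end do
    end do
  end do

  ! Sum every element of the array in one call.
  tot_sum = sum(elem)

  print *, 'running total =', tot_loop
  print *, 'SUM intrinsic =', tot_sum
  if (abs(tot_sum - tot_loop) > 1.0e-9_dp * abs(tot_loop)) then
    error stop 'SUM and running total disagree'
  end if
end program sum_intrinsic
```

The two totals can differ in the last few digits because the additions may be carried out in a different order. The array costs extra memory (about 420,000 double precision values here), so this is only worth keeping if it measures faster.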
 
  • #10
Appreciate all the responses. The portion of code in question was just one step in a very lengthy problem. I am quite pleased with the improvements you all have helped me achieve - over 50% faster on the one subroutine which was called over 12,000 times. The changes have moved my dissertation one step closer to completion, thanks!

I'll respond to each suggestion below:

1. I had already tried doing a "sum" outside the nest with no noticeable improvement in calculation time.
2. There is no way around four integrations, the integrations are not spheres but incident and reflected angles.
3. I sucked it up further and put the code in-line for comparison. It took 4m 22s just as it did with a subroutine call. I believe the compiler optimizations I chose may include making the subroutine in-line anyway. I returned to using the subroutine.
4. Great catch on the 360-step. That was a mistake that I did not catch. With that change, I am down to 4m 13s...and what's better, the calculation is now right!
5. Not sure what benefit "C" would have in this project. That change is too significant to test.
 
  • #11
blbelson said:
There is no way around four integrations, the integrations are not spheres but incident and reflected angles.

Does the system possess rotational symmetry? If it does, one integration over 0 to 360 can be eliminated.

blbelson said:
Not sure what benefit "C" would have in this project. That change is too significant to test.

Being a somewhat lower-level language, C is considerably faster than Fortran (up to 2-3 times, depending on task) and that makes it more suited for heavy numerical programming.

http://shootout.alioth.debian.org/u32/benchmark.php?test=all&lang=gcc&lang2=ifc&box=1
 
  • #12
hamster143 said:
Being a somewhat lower-level language, C is considerably faster than Fortran.
It really depends on the compiler. Comparing a bad Fortran compiler with a good C compiler isn't fair. In the case of some Cray supercomputers (like the nearly extinct X1 and X1E series vector processing machines), the Fortran compiler is faster than C, partly because some vendor-specific extensions were made to the Fortran language used on the Cray, and partly because that particular Fortran compiler optimizes very well on the Cray supercomputer. Newer Cray systems are supposed to combine Intel or AMD CPUs with specialized vector math units, but I don't know how many of these have been made.

I don't know what options there are in terms of Fortran compilers for PC-based systems, or which of these, if any, do a really good job of optimizing code, or whether in this case the required floating point calculations simply can't be optimized beyond some basic level.
 
  • #13
hamster143 said:
Does the system possess rotational symmetry? If it does, one integration over 0 to 360 can be eliminated.

The system is not rotationally symmetric. The integrations are over all possible incident and reflected angles.
 

