...

Whenever something in a loop inhibits vectorization, it's often best to fission the loop: isolate that operation in a loop of its own, and group the majority of the work into loops where everything vectorizes.
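As a minimal sketch of loop fission, assume a hypothetical loop that mixes independent per-element arithmetic with a running sum. The running sum is a loop-carried dependence, which typically keeps the compiler from vectorizing the loop. The module name, routine names, and the arithmetic are invented for illustration.

```fortran
module fission_sketch
  implicit none
  integer, parameter :: rk = kind(1.0d0)
contains

  ! Fused version: the running sum depends on the previous iteration,
  ! so the independent arithmetic on a(i) is stuck in a loop that
  ! typically does not vectorize.
  subroutine fused(n, b, c, a, s)
    integer, intent(in) :: n
    real(rk), intent(in) :: b(n), c(n)   ! b is assumed non-negative
    real(rk), intent(out) :: a(n), s(n)
    real(rk) :: running
    integer :: i
    running = 0.0_rk
    do i = 1, n
      a(i) = sqrt(b(i)) * c(i) + 1.0_rk
      running = running + a(i)
      s(i) = running
    end do
  end subroutine fused

  ! Fissioned version: same results, but the work is split in two loops.
  subroutine fissioned(n, b, c, a, s)
    integer, intent(in) :: n
    real(rk), intent(in) :: b(n), c(n)   ! b is assumed non-negative
    real(rk), intent(out) :: a(n), s(n)
    real(rk) :: running
    integer :: i
    ! Loop 1: every iteration is independent, so this can vectorize.
    do i = 1, n
      a(i) = sqrt(b(i)) * c(i) + 1.0_rk
    end do
    ! Loop 2: the loop-carried dependence is isolated by itself.
    running = 0.0_rk
    do i = 1, n
      running = running + a(i)
      s(i) = running
    end do
  end subroutine fissioned

end module fission_sketch
```

Whether the first loop actually vectorizes depends on the compiler and flags; a vectorization report (for example gfortran's -fopt-info-vec) is one way to check.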

Miscellaneous

Array Allocation

Modern languages can allocate memory for arrays on the "heap" or on the "stack". Global arrays (or arrays with the "save" attribute) are allocated in the heap when the model initializes, and they exist (taking up memory) for the entire simulation. Arrays can also be allocated while the code is running with the allocate() and deallocate() statements. These arrays are placed in the heap as well, but the allocate() statement requires a system call and takes significant time (tens of thousands of instructions) as the system looks for available memory in the heap. Allocate can be even slower in a threaded region, as it usually requires thread synchronization. The allocate() statement should never be used inside frequently called subroutines.
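One way to keep allocate() out of frequently called code is sketched below, assuming a hypothetical model field that is allocated once at initialization and then reused at every step. The module, the routine names, and the trivial update are invented for illustration and are not taken from any particular model.

```fortran
module state_sketch
  implicit none
  integer, parameter :: rk = kind(1.0d0)
  ! Hypothetical model field: allocated once, kept for the whole run.
  real(rk), allocatable :: field(:)
contains

  ! Called once at model initialization: the only place allocate() appears.
  subroutine init_state(n)
    integer, intent(in) :: n
    integer :: i
    if (.not. allocated(field)) allocate(field(n))
    do i = 1, n
      field(i) = 0.0_rk
    end do
  end subroutine init_state

  ! Called every time step: no allocate() or deallocate() in here.
  subroutine step_state(increment)
    real(rk), intent(in) :: increment
    integer :: i
    do i = 1, size(field)
      field(i) = field(i) + increment
    end do
  end subroutine step_state

  ! Called once at shutdown.
  subroutine finalize_state()
    if (allocated(field)) deallocate(field)
  end subroutine finalize_state

end module state_sketch
```

A driver would call init_state once, step_state as often as needed, and finalize_state once at the end, so the cost of allocate() is paid a single time.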

Automatic arrays on the stack: The most efficient way to create temporary arrays is to use local arrays declared in the subroutine. These arrays will be placed on the stack (although sometimes compiler options are needed to force larger automatic arrays to be placed on the stack). Creating an array on the stack is nearly instant, as it only requires moving the stack pointer. One drawback of stack variables is that each OpenMP thread must have its own stack, and this memory must be reserved ahead of time with the OMP_STACKSIZE environment variable. If it is too small, the code will segfault when it runs out of stack memory. If it is too large, reserving all of that stack space when the code starts can leave insufficient memory for the rest of the model.
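A minimal sketch of an automatic array, assuming a hypothetical three-point averaging routine that needs a padded temporary copy of its input. The routine name and the stencil are invented for illustration. The temporary tmp is sized from the dummy argument n, so it is created on entry and released on return with no allocate() call.

```fortran
module automatic_sketch
  implicit none
  integer, parameter :: rk = kind(1.0d0)
contains

  ! Hypothetical frequently called routine (assumes n >= 1).
  subroutine centered_average(n, x, y)
    integer, intent(in) :: n
    real(rk), intent(in) :: x(n)
    real(rk), intent(out) :: y(n)
    real(rk) :: tmp(0:n+1)   ! automatic array: exists only during this call
    integer :: i
    ! Pad both ends by repeating the edge values.
    tmp(0) = x(1)
    tmp(n+1) = x(n)
    do i = 1, n
      tmp(i) = x(i)
    end do
    ! Three-point average using the padded temporary.
    do i = 1, n
      y(i) = (tmp(i-1) + tmp(i) + tmp(i+1)) / 3.0_rk
    end do
  end subroutine centered_average

end module automatic_sketch
```

Where the compiler places tmp is not fixed by the language. With gfortran, for example, the -fstack-arrays option requests stack placement for arrays of unknown size, and other compilers have their own options, so it is worth checking the documentation for the compiler in use.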

Recommendations:

  1. Allocate dynamic arrays only at a high level and infrequently.
  2. Use stack arrays for all subroutine array temporaries.
  3. Understand the size of these arrays in order to have a good estimate of the needed stacksize.
  4. See below for similar problems caused by array slicing.
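For the stacksize estimate in item 3, the arithmetic can be sketched as below. The helper and its arguments (ncol, nlev, ntemps) are hypothetical, and it assumes every temporary in a routine is a double-precision array of the same shape, which real code will not always satisfy.

```fortran
module stack_estimate_sketch
  use iso_fortran_env, only: int64, real64
  implicit none
contains

  ! Bytes used by ntemps automatic real64 arrays of shape (ncol, nlev)
  ! in a single call frame.
  pure function temporaries_bytes(ncol, nlev, ntemps) result(nbytes)
    integer, intent(in) :: ncol, nlev, ntemps
    integer(int64) :: nbytes
    real(real64) :: sample
    ! storage_size returns the element size in bits, so divide by 8.
    nbytes = int(ncol, int64) * int(nlev, int64) * int(ntemps, int64) &
             * int(storage_size(sample) / 8, int64)
  end function temporaries_bytes

end module stack_estimate_sketch
```

For example, temporaries_bytes(16, 72, 10) gives 92160 bytes (90 KiB) for one call frame. Each thread's stack has to hold the deepest chain of such frames at the same time, so the per-routine numbers should be summed along the call chain, with some headroom, before choosing a value for OMP_STACKSIZE.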

Array Slicing

In my opinion, there are very few cases where array slicing is wise. I know it's a convenient feature of Fortran, but it's the cause of some of our worst performance degradations. The only time I think array slicing is OK is when you're moving a contiguous chunk of data, not computing things. For instance:

...