Data is often initialized by a single thread, or outside any threaded region, while most later accesses to that data happen in threaded code. This is a problem because memory affinity is typically determined by first touch: data is placed close to the thread (and usually the core) that first writes to it, not close to the threads that use it later. To fix this, initialize your data with the same threads that you later use for the calculations.
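A minimal sketch of threaded initialization in C with OpenMP follows. The function names alloc_first_touch and scale_add are hypothetical, and the sketch assumes that a static schedule with the same loop bounds assigns the same index ranges to the same threads in both loops. Without OpenMP enabled, the pragmas are ignored and the code runs serially.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: allocate n doubles and initialize them in a threaded
 * loop, so that under a first-touch policy each thread touches its own
 * portion of the array first. Returns NULL if allocation fails. */
double *alloc_first_touch(size_t n)
{
    double *a = malloc(n * sizeof *a);
    if (a == NULL)
        return NULL;

    /* Threaded initialization with the same schedule as the compute loop. */
    #pragma omp parallel for schedule(static)
    for (ptrdiff_t i = 0; i < (ptrdiff_t)n; ++i)
        a[i] = 0.0;

    return a;
}

/* Hypothetical compute kernel: uses the same static schedule and bounds,
 * so each thread works on the elements it initialized. */
void scale_add(double *a, size_t n, double s, double c)
{
    assert(a != NULL);

    #pragma omp parallel for schedule(static)
    for (ptrdiff_t i = 0; i < (ptrdiff_t)n; ++i)
        a[i] = s * a[i] + c;
}
```

The caller allocates once with alloc_first_touch, runs scale_add as often as needed, and releases the array with free. The benefit depends on threads staying on the same cores, so this is usually combined with thread pinning.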

Other threading recommendations:  

  1. Minimize entrances to and exits from parallel regions. Setting OMP_WAIT_POLICY=ACTIVE can reduce the cost of re-entering a region, but it is more robust to make parallel regions as long as possible.  
  2. Thread over nested loops, not only the outermost loop, using the collapse clause or explicit division and modulo arithmetic on a flattened index.
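The two recommendations above can be sketched in C with OpenMP as follows. All function names are hypothetical and the arrays are assumed to be stored row-major. The first function keeps two work-sharing loops inside one parallel region, the second uses the collapse clause, and the third does the equivalent flattening by hand.

```c
#include <assert.h>
#include <stddef.h>

/* One long parallel region holding two work-sharing loops, instead of
 * entering and leaving a parallel region once per loop. */
void two_phases(double *a, double *b, int n)
{
    assert(n > 0);

    #pragma omp parallel
    {
        #pragma omp for
        for (int i = 0; i < n; ++i)
            a[i] = (double)i;
        /* The implicit barrier at the end of the first loop guarantees
         * that a[] is complete before the second loop reads it. */
        #pragma omp for
        for (int i = 0; i < n; ++i)
            b[i] = 2.0 * a[i];
    }
}

/* collapse(2) merges the i and j loops into one iteration space of
 * rows * cols iterations that is divided among the threads. */
void fill_collapse(double *m, int rows, int cols)
{
    assert(rows > 0 && cols > 0);

    #pragma omp parallel for collapse(2)
    for (int i = 0; i < rows; ++i)
        for (int j = 0; j < cols; ++j)
            m[(size_t)i * cols + j] = (double)(i + j);
}

/* Same result with explicit flattening: thread over a single index k and
 * recover (i, j) with division and modulo. */
void fill_divmod(double *m, int rows, int cols)
{
    assert(rows > 0 && cols > 0);

    long total = (long)rows * cols;

    #pragma omp parallel for
    for (long k = 0; k < total; ++k) {
        int i = (int)(k / cols);
        int j = (int)(k % cols);
        m[k] = (double)(i + j);
    }
}
```

Both fill variants expose rows * cols iterations to the runtime, which helps when the outer loop alone has fewer iterations than there are threads. The explicit form is an option when the loop nest does not meet the requirements of the collapse clause.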