...

Vectorization is a little harder: it relies on uncoupled data that undergo the same operations being laid out in memory the same way as our vectors. The variables maintained at the Gauss points of different elements in Homme's dycore are good examples, as they are rarely dependent on each other. This suggests putting the element index last, which allows direct reads from memory into vector registers. This can conflict significantly with the cache locality requirement, especially when threading over elements or using MPI, so an alternative is to "stripe" our data with values from different elements (levels might also work).
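
To make the two layouts concrete, here is a minimal sketch in C of how the flat offsets could be computed. The sizes (NUM_ELEMS, NUM_POINTS, VEC_LEN) and the function names are hypothetical and are not taken from Homme. The striped version also assumes one particular reading of "striping": elements are grouped into blocks of VEC_LEN, and all Gauss points of a block are stored together.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sizes, chosen only for illustration (not Homme's). */
enum { NUM_ELEMS = 8, NUM_POINTS = 4, VEC_LEN = 4 };

/* Element index last: for a fixed Gauss point, consecutive elements are
   adjacent in memory, so VEC_LEN of them can be loaded as one vector.
   The points of a single element are NUM_ELEMS values apart. */
size_t elem_last_offset(size_t point, size_t elem)
{
    assert(point < NUM_POINTS && elem < NUM_ELEMS);
    return point * NUM_ELEMS + elem;
}

/* Striped (assumed interpretation): elements are grouped into blocks of
   VEC_LEN. Inside a block the element "lane" is the fastest index, but
   all points of that block sit next to each other in memory. */
size_t striped_offset(size_t point, size_t elem)
{
    assert(point < NUM_POINTS && elem < NUM_ELEMS);
    size_t block = elem / VEC_LEN;
    size_t lane = elem % VEC_LEN;
    return (block * NUM_POINTS + point) * VEC_LEN + lane;
}

/* Scale every value of a striped array. The innermost loop touches
   VEC_LEN contiguous doubles, which a compiler can map to one vector. */
void scale_striped(double *data, double factor)
{
    for (size_t block = 0; block < NUM_ELEMS / VEC_LEN; ++block) {
        for (size_t point = 0; point < NUM_POINTS; ++point) {
            double *lanes = &data[(block * NUM_POINTS + point) * VEC_LEN];
            for (size_t lane = 0; lane < VEC_LEN; ++lane) {
                lanes[lane] *= factor;
            }
        }
    }
}
```

With these sizes, a thread that owns elements 0 to 3 touches one contiguous run of NUM_POINTS * VEC_LEN values in the striped layout. In the element-last layout the same thread would jump NUM_ELEMS values between Gauss points, which is where the conflict with cache locality comes from.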

...