Vector Programming Techniques
Help the Compiler Help You
Start with scalar code, which is the most portable, and apply
techniques that help the compiler vectorize it. Align your
data on 16-byte boundaries wherever possible, and tell the
compiler that it is aligned. Use __restrict__ pointers to
promise that the data they reference does not alias.
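The following minimal sketch illustrates both hints, assuming a
GCC- or Clang-compatible compiler. The function names add_arrays
and add_arrays_demo are hypothetical and chosen for illustration
only. The aligned attribute and __builtin_assume_aligned are
GCC/Clang extensions; other compilers may spell these
differently or not provide them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* dst[i] = x[i] + y[i] for i in [0, n).
 * __restrict__ promises the compiler that the three arrays do not overlap.
 * All three pointers are assumed to be 16-byte aligned (caller's duty). */
void add_arrays(float *__restrict__ dst,
                const float *__restrict__ x,
                const float *__restrict__ y,
                size_t n)
{
    /* Debug-build check that the alignment promise actually holds. */
    assert((((uintptr_t)dst | (uintptr_t)x | (uintptr_t)y) % 16) == 0);

    /* Tell the compiler the data is aligned (GCC/Clang builtin). */
    float *d = __builtin_assume_aligned(dst, 16);
    const float *px = __builtin_assume_aligned(x, 16);
    const float *py = __builtin_assume_aligned(y, 16);

    /* A plain scalar loop; the hints above make it easier to vectorize. */
    for (size_t i = 0; i < n; i++)
        d[i] = px[i] + py[i];
}

/* Hypothetical usage: arrays declared on 16-byte boundaries. */
float add_arrays_demo(void)
{
    static float x[8]   __attribute__((aligned(16)));
    static float y[8]   __attribute__((aligned(16)));
    static float sum[8] __attribute__((aligned(16)));

    for (size_t i = 0; i < 8; i++) {
        x[i] = (float)i;
        y[i] = 1.0f;
    }
    add_arrays(sum, x, y, 8);
    return sum[3];  /* x[3] + y[3] */
}
```

Whether the loop is actually vectorized depends on the compiler
and the optimization level in use. Both promises are
unchecked in release builds: passing unaligned or overlapping
arrays results in undefined behavior.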
Use Portable Intrinsics
Individual compilers may provide other intrinsic support. Only
the intrinsics in this manual are guaranteed to be portable
across compliant compilers.
Some compilers may provide compatibility headers for use with
other architectures. Recent GCC and Clang compilers support
compatibility headers for the lower levels of the x86 vector
architecture, such as MMX and SSE. These headers can ease an
initial port, but for best performance it is preferable to
rewrite important sections of code with native Power intrinsics.
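As a rough illustration of what a section rewritten with native
intrinsics might look like, the sketch below computes
y[i] = a * x[i] + y[i] four floats at a time. The function name
saxpy is hypothetical. The sketch assumes a compiler that
defines __VSX__ and provides vec_splats, vec_xl, vec_madd, and
vec_xst through <altivec.h>; when __VSX__ is not defined, only
the scalar loop is compiled, so the same source builds on other
targets.

```c
#include <stddef.h>

#if defined(__VSX__)
#include <altivec.h>
#endif

/* y[i] = a * x[i] + y[i] for i in [0, n). */
void saxpy(float a, const float *x, float *y, size_t n)
{
    size_t i = 0;

#if defined(__VSX__)
    /* Replicate the scalar a into all four vector elements. */
    __vector float va = vec_splats(a);

    /* Process four floats per iteration. */
    for (; i + 4 <= n; i += 4) {
        __vector float vx = vec_xl(0, x + i);  /* load x[i..i+3] */
        __vector float vy = vec_xl(0, y + i);  /* load y[i..i+3] */
        vy = vec_madd(va, vx, vy);             /* a * x + y      */
        vec_xst(vy, 0, y + i);                 /* store y[i..i+3] */
    }
#endif

    /* Scalar loop: handles the tail, or everything without VSX. */
    for (; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```

Because vec_madd is a fused multiply-add, results on the vector
path may differ in the last bit from a scalar loop that rounds
the product separately.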
Use Assembly Code Sparingly