Floating Point Exceptions
Nominally both ISAs support the IEEE-754 specification, but there are
some subtle differences. Both architectures define a status and control register
that records exceptions and enables or disables floating-point exceptions for
program interrupt or default action. Intel has an MXCSR and PowerISA has an FPSCR,
which serve the same purpose but use different bit layouts.
Intel provides the _mm_setcsr / _mm_getcsr intrinsic functions to allow direct
access to the MXCSR. This might have been useful in the early days, before the
OS run-times were updated to manage the MXCSR via the POSIX APIs. Today this
would be highly discouraged, with a strong preference for the POSIX APIs
(feclearexcept, fegetexceptflag, fesetexceptflag, ...) instead.
If we implement _mm_setcsr / _mm_getcsr at all, we should simply redirect the
implementation to use the POSIX APIs from <fenv.h>. But it might be simpler
just to replace these intrinsics with macros that generate a compile-time
error (#error).
The Intel MXCSR does have some quirks that fall outside the POSIX and IEEE-754
standards: the Flush-To-Zero and Denormals-Are-Zeros flags. Flush-To-Zero
simplifies the hardware response to what should be a rare condition (underflows
where the result cannot be represented in the exponent range and precision of
the format) by simply returning a signed 0.0 value. Denormals-Are-Zeros likewise
treats denormal input operands as signed 0.0. The intrinsic header
implementation does provide constant masks for _MM_DENORMALS_ZERO_ON
(<pmmintrin.h>) and _MM_FLUSH_ZERO_ON (<xmmintrin.h>), so technically these
modes are available to users of the Intel Intrinsics API.
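To make the difference concrete, the sketch below halves the smallest normal
float and classifies the result using only standard <float.h> and <math.h>
facilities. The function name classify_halved_flt_min is hypothetical. Under
the default IEEE-754 behaviour (gradual underflow) the result is expected to be
a subnormal number. With a flush-to-zero mode in effect, or with compiler
options that enable one, it would be expected to classify as zero instead.

```c
#include <assert.h>
#include <float.h>
#include <math.h>

/* Hypothetical helper: compute FLT_MIN / 2 at run time and return its
   fpclassify() category (FP_SUBNORMAL, FP_ZERO, ...). */
int classify_halved_flt_min(void)
{
    /* volatile keeps the compiler from evaluating this at compile time */
    volatile float smallest_normal = FLT_MIN;
    volatile float halved;
    float result;

    /* Halving is exact only for a binary floating-point format */
    assert(FLT_RADIX == 2);

    halved = smallest_normal / 2.0f;   /* underflows the normal range */
    result = halved;
    return fpclassify(result);
}
```

Code that depends on the FP_ZERO outcome is relying on behaviour outside
IEEE-754.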
The VMX Vector facility provides a separate Vector Status and Control
register (VSCR) with a Non-Java Mode control bit. This control combines the
flush-to-zero semantics for floating point underflow and denormal values. But
this control only applies to VMX vector float instructions and does not apply
to VSX scalar floating-point or vector double instructions. The FPSCR does
define a Floating-Point non-IEEE mode which is optional in the architecture.
This would apply to Scalar and VSX floating-point operations if it were
implemented. This was largely intended for embedded processors and is not
implemented in the POWER processor line.
As flush-to-zero is primarily a performance enhancement and is
clearly outside the IEEE-754 standard, it may be best to simply ignore this
option for the intrinsic port.