From ad2c1c167177ffa8664c7d0e427d0b4d713b66db Mon Sep 17 00:00:00 2001 From: Bill Schmidt Date: Tue, 26 Jun 2018 17:01:31 -0500 Subject: [PATCH] Changes through vec_xl_be, plus vec_xst. Signed-off-by: Bill Schmidt --- Intrinsics_Reference/ch_vec_reference.xml | 680 ++++++++++++++-------- 1 file changed, 443 insertions(+), 237 deletions(-) diff --git a/Intrinsics_Reference/ch_vec_reference.xml b/Intrinsics_Reference/ch_vec_reference.xml index 7cd809c..19f8792 100644 --- a/Intrinsics_Reference/ch_vec_reference.xml +++ b/Intrinsics_Reference/ch_vec_reference.xml @@ -18355,7 +18355,8 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> r contains element 0 of a, truncated to a signed integer. Element 2 of r contains element 1 of a, truncated to a signed integer. + role="bold">a, truncated to a signed integer. Elements 1 and + 3 of r are undefined. Endian considerations: The element numbering within a register is left-to-right for big-endian targets, and right-to-left for little-endian targets. @@ -18400,13 +18401,13 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> - xvdvdpsxws t,a + xvcvdpsxws t,a vsldoi r,t,t,12 - xvdvdpsxws t,a + xvcvdpsxws t,a @@ -18433,7 +18434,8 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> r contains element 0 of a, truncated to a signed integer. Element 3 of r contains element 1 of a, truncated to a signed integer. + role="bold">a, truncated to a signed integer. Elements 0 and + 2 of r are undefined. Endian considerations: The element numbering within a register is left-to-right for big-endian targets, and right-to-left for little-endian targets. @@ -23108,21 +23110,16 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> role="bold">a and element n of b using saturated addition. Endian considerations: - The element numbering within a register is left-to-right for big-endian - targets, and right-to-left for little-endian targets. 
+ None. - Notes: - Issue #438 in the power-gcc github tracker has been opened - for wrong little-endian behavior. Supported type signatures for vec_sum4s - + - @@ -23132,20 +23129,16 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> - ARG1 + a - ARG2 + b - Example LE - Implementation - - - Example BE + Example Implementation @@ -23163,11 +23156,6 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> - TBD - - - - vsum4sbs r,a,b @@ -23184,11 +23172,6 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> - TBD - - - - vsum4shs r,a,b @@ -23205,11 +23188,6 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> - TBD - - - - vsum4ubs r,a,b @@ -24088,15 +24066,19 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> vec_unsigned - Vector ... Spelled Out Name TBD + Vector Convert Floating-Point to Unsigned Integer - r = vec_unsigned (ARG1) + r = vec_unsigned (a) Purpose: - Converts a vector of double-precision numbers to a vector of unsigned integers. + Converts a vector of floating-point numbers to a vector of unsigned + integers. - Result value: Target elements are obtained by truncating the respective source elements to unsigned integers. + Result value: Each element of + r is obtained by truncating the + corresponding element of a to an + unsigned integer. Endian considerations: None. @@ -24116,7 +24098,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> - ARG1 + a @@ -24133,7 +24115,9 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> vector float - sample implementation TBD + + xvcvspuxws r,a + @@ -24144,7 +24128,9 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> vector double - sample implementation TBD + + xvcvdpuxds r,a + @@ -24156,53 +24142,58 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> vec_unsigned2 - Vector ...
Spelled Out Name TBD + Vector Convert Double-Precision to Unsigned Word - r = vec_unsigned2 (ARG1, ARG2) + r = vec_unsigned2 (a, b) Purpose: - Converts a vector of double-precision numbers to a vector of unsigned integers. - - Result value: Target elements are obtained by truncating the source elements to the unsigned integers as follows: - - - Target elements 0 and 1 from source 0 - - - Target elements 2 and 3 from source 1 - - + Converts two vectors of double-precision floating-point numbers to a + vector of unsigned 32-bit integers. + + Result value: Let v be the concatenation of a and b. Each + element of r is obtained by truncating + the corresponding element of v to an + unsigned 32-bit integer. Endian considerations: - None. + The element numbering within a register is left-to-right for big-endian + targets, and right-to-left for little-endian targets.
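The result-value wording for vec_unsigned2 can be checked against a small scalar model. This is only a sketch in portable C, not the intrinsic itself: the names trunc_to_u32 and model_vec_unsigned2 are hypothetical, arrays are indexed by element number rather than by register position, and the clamping of negative, NaN, and out-of-range inputs is an assumption about the saturating behavior of the conversion instruction that the text above does not state.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: truncate a double toward zero into a uint32_t.
 * Assumption: negatives and NaN give 0, values too large saturate. */
static uint32_t trunc_to_u32(double x)
{
    if (!(x > 0.0))              /* negative, zero, or NaN */
        return 0;
    if (x >= 4294967296.0)       /* 2^32 and above */
        return UINT32_MAX;
    return (uint32_t)x;          /* in range: C truncates toward zero */
}

/* Scalar model: v is the concatenation of a and b, and each element
 * of r is the truncation of the corresponding element of v. */
static void model_vec_unsigned2(const double a[2], const double b[2],
                                uint32_t r[4])
{
    assert(a && b && r);
    r[0] = trunc_to_u32(a[0]);
    r[1] = trunc_to_u32(a[1]);
    r[2] = trunc_to_u32(b[0]);
    r[3] = trunc_to_u32(b[1]);
}
```

Because the model works on element numbers, it reads the same for big-endian and little-endian targets; only the instruction sequences in the table differ.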
Supported type signatures for vec_unsigned2 - + + - + r - + - ARG1 + a - + - ARG2 + b - Example Implementation + Example LE + Implementation + + + Example BE + Implementation @@ -24215,10 +24206,25 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> vector double - vector double + vector double - sample implementation TBD + + xxpermdi t,b,a,3 + xxpermdi u,b,a,0 + xvcvdpuxws v,t + xvcvdpuxws w,u + vmrgow r,w,v + + + + + xxpermdi t,a,b,3 + xxpermdi u,a,b,0 + xvcvdpuxws v,t + xvcvdpuxws w,u + vmrgew r,v,w + @@ -24230,41 +24236,55 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> vec_unsignede - Vector ... Spelled Out Name TBD + Vector Convert Double-Precision to Unsigned Word + Even - r = vec_unsignede (ARG1) + r = vec_unsignede (a) Purpose: - Converts an input vector to a vector of unsigned integers. + Converts elements of an input vector to unsigned integers and stores + them in the even-numbered elements of the result vector. - Result value: The even target elements are obtained by truncating the source elements to unsigned integers as follows: -Target elements 0 and 2 contain the converted values of the - input vector. + Result value: Element 0 of + r contains element 0 of a, truncated to an unsigned integer. Element 2 of + r contains element 1 of a, truncated to an unsigned integer. Elements 1 and + 3 of r are undefined. Truncation + of a negative number to an unsigned integer results in a value of + zero. Endian considerations: - None. + The element numbering within a register is left-to-right for big-endian + targets, and right-to-left for little-endian targets.
Supported type signatures for vec_unsignede - + + - + r - + - ARG1 + a - Example Implementation + Example LE + Implementation + + + Example BE + Implementation @@ -24277,7 +24297,16 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> vector double - sample implementation TBD + + xvcvdpuxws t,a + vsldoi r,t,t,12 + + + + + xvcvdpuxws r,a + + @@ -24289,41 +24318,54 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> vec_unsignedo - Vector ... Spelled Out Name TBD + Vector Convert Double-Precision to Unsigned Word Odd - r = vec_unsignedo (ARG1) + r = vec_unsignedo (a) Purpose: - Converts an input vector to a vector of unsigned integers. + Converts elements of an input vector to unsigned integers and stores + them in the odd-numbered elements of the result vector. - Result value: The odd target elements are obtained by truncating the source elements to unsigned integers as follows: -Target elements 1 and 3 contain the converted values of the - input vector. + Result value: Element 1 of + r contains element 0 of a, truncated to an unsigned integer. Element 3 of + r contains element 1 of a, truncated to an unsigned integer. Elements 0 + and 2 of r are undefined. Truncation + of a negative number to an unsigned integer results in a value of + zero. Endian considerations: - None. + The element numbering within a register is left-to-right for big-endian + targets, and right-to-left for little-endian targets.
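The even and odd variants differ only in which result elements receive the converted values. The following portable C sketch models that placement; it is an illustration, not the intrinsics. The names model_vec_unsignede, model_vec_unsignedo, and trunc_to_u32 are hypothetical, and the model simply leaves the undefined elements untouched, whereas a real implementation may leave any value there. Negative inputs map to zero as the text states; the handling of NaN and of values at or above 2^32 is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: truncate toward zero, negative inputs give 0.
 * Assumption: NaN gives 0 and values >= 2^32 saturate. */
static uint32_t trunc_to_u32(double x)
{
    if (!(x > 0.0))
        return 0;
    if (x >= 4294967296.0)
        return UINT32_MAX;
    return (uint32_t)x;
}

/* Even variant: elements 0 and 2 of r are written.
 * Elements 1 and 3 are undefined; this model does not touch them. */
static void model_vec_unsignede(const double a[2], uint32_t r[4])
{
    assert(a && r);
    r[0] = trunc_to_u32(a[0]);
    r[2] = trunc_to_u32(a[1]);
}

/* Odd variant: elements 1 and 3 of r are written.
 * Elements 0 and 2 are undefined; this model does not touch them. */
static void model_vec_unsignedo(const double a[2], uint32_t r[4])
{
    assert(a && r);
    r[1] = trunc_to_u32(a[0]);
    r[3] = trunc_to_u32(a[1]);
}
```

Portable code should read only the elements the variant defines, since the contents of the others can differ between the LE and BE instruction sequences.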
Supported type signatures for vec_unsignedo - + + - + r - + - ARG1 + a - Example Implementation + Example LE + Implementation + + + Example BE + Implementation @@ -24336,7 +24378,16 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> vector double - sample implementation TBD + + xvcvdpuxws r,a + + + + + + xvcvdpuxws t,a + vsldoi r,t,t,12 + @@ -24348,25 +24399,45 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> vec_xl - Vector ... Spelled Out Name TBD + VSX Unaligned Load - r = vec_xl (ARG1, ARG2) + r = vec_xl (a, b) Purpose: - Loads a 16-byte vector from the memory address specified by the displacement and the pointer. - - Result value: This function adds the displacement and the pointer R-value to obtain the address for the load operation. - - For languages that support built-in - methods for pointer dereferencing, such as the C/C++ pointer - dereference * and array access [ ] operators, use of the - native operators is encouraged and use of the vec_xl - intrinsic is discouraged. - + Loads a 16-byte vector from the memory address specified by the + displacement and the pointer. + + Result value: The value of + r is obtained by adding a and b, then + loading the 16-byte vector from the resultant memory address. Endian considerations: - None. + For ISA 2.07, there is no bi-endian unaligned load instruction. + For little-endian targets, it is necessary to use the lxvd2x instruction + and swap the doublewords with an xxswapd instruction. For big-endian + targets, the lxvd2x instruction or lxvw4x instruction suffices. The + examples below assume ISA 3.0, where the bi-endian lxv instruction is + available. + Notes: + + + + For languages that support built-in methods for pointer + dereferencing, such as the C/C++ * and [ ] operators, use of the + native operators is encouraged when the memory to be accessed is + aligned on a 32-bit boundary or aligned to the type of b, whichever is weaker. 
+ + + + + No Power compilers yet support the vector _Float16 type, so that + interface is currently deferred. + + +
Supported type signatures for vec_xl @@ -24378,25 +24449,26 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> - + r - + - ARG1 + a - + - ARG2 + b - Example Implementation + Example ISA 3.0 + Implementation - + Restrictions @@ -24410,10 +24482,12 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> signed long long - signed char * + signed char * - sample implementation TBD + + lxv r,a(b) + @@ -24430,7 +24504,9 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> unsigned char * - sample implementation TBD + + lxv r,a(b) + @@ -24447,7 +24523,9 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> signed int * - sample implementation TBD + + lxv r,a(b) + @@ -24464,7 +24542,9 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> unsigned int * - sample implementation TBD + + lxv r,a(b) + @@ -24481,7 +24561,9 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> signed __int128 * - sample implementation TBD + + lxv r,a(b) + @@ -24498,7 +24580,9 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> unsigned __int128 * - sample implementation TBD + + lxv r,a(b) + @@ -24515,7 +24599,9 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> signed long long * - sample implementation TBD + + lxv r,a(b) + @@ -24532,7 +24618,9 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> unsigned long long * - sample implementation TBD + + lxv r,a(b) + @@ -24549,7 +24637,9 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> signed short * - sample implementation TBD + + lxv r,a(b) + @@ -24566,7 +24656,9 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> unsigned short * - sample implementation TBD + + lxv r,a(b) + @@ -24583,7 +24675,9 @@ xmlns:xlink="http://www.w3.org/1999/xlink" 
xml:id="section_vec_intrinsics"> double * - sample implementation TBD + + lxv r,a(b) + @@ -24600,7 +24694,9 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> float * - sample implementation TBD + + lxv r,a(b) + @@ -24617,10 +24713,12 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> _Float16 * - sample implementation TBD + + lxv r,a(b) + - ISA 3.0 or later + Deferred @@ -24632,51 +24730,66 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> vec_xl_be - Vector ... Spelled Out Name TBD + VSX Unaligned Load as Big Endian - r = vec_xl_be (ARG1, ARG2) + r = vec_xl_be (a, b) Purpose: - In little-endian environments, loads the elements of the 16-byte vector ARG1 starting with the highest-numbered element at the memory address specified by the displacement ARG1 and the pointer ARG2. In big-endian environments, this operator performs the same operation as VEC_XL. - - Result value: In little-endian mode, loads the elements of the vector in sequential order, with the highest-numbered element loaded from the lowest data address and the lowest-numbered element of the vector at the highest address. All elements are loaded in little-endian data format. -This function adds the displacement and the pointer R-value - to obtain the address for the load operation. It does not - truncate the affected address to a multiple of 16 bytes. + Loads a vector from an address into a register in big-endian element + order, regardless of the endianness of the target machine. + + Result value: The value of + r is obtained by adding a and b, then + loading the vector elements from the resulting address in big-endian + order. Endian considerations: - None. + In big-endian mode, this acts just like the vec_xl intrinsic. + In little-endian mode, the highest-numbered element of r is loaded from the lowest data address, and + the lowest-numbered element of r from + the highest data address. 
+ Notes: + No Power compilers yet support the vector _Float16 type, so that + interface is currently deferred.
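The difference between vec_xl and vec_xl_be can be sketched with a scalar model for four 32-bit elements. This is portable C written for illustration only. The names model_vec_xl and model_vec_xl_be are hypothetical, the little_endian_target flag stands in for the endianness of the target, and the displacement is treated as a byte offset, which is an assumption consistent with adding a to the pointer b to form the address.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { NELEM = 4 };   /* four 32-bit elements in a 16-byte vector */

/* Model of vec_xl: add byte displacement a to pointer b, then load
 * 16 bytes. memcpy is used so the address need not be aligned. */
static void model_vec_xl(long long a, const int32_t *b, int32_t r[NELEM])
{
    assert(b && r);
    memcpy(r, (const unsigned char *)b + a, NELEM * sizeof(int32_t));
}

/* Model of vec_xl_be: same address computation. On a little-endian
 * target the highest-numbered element comes from the lowest address;
 * on a big-endian target it behaves like model_vec_xl. */
static void model_vec_xl_be(long long a, const int32_t *b,
                            int32_t r[NELEM], int little_endian_target)
{
    int32_t tmp[NELEM];
    model_vec_xl(a, b, tmp);
    for (int i = 0; i < NELEM; i++)
        r[i] = little_endian_target ? tmp[NELEM - 1 - i] : tmp[i];
}
```

Only the element order is reversed in the model; the bytes within each element keep the target's native order.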
Supported type signatures for vec_xl_be - + + - + r - + - ARG1 + a - + - ARG2 + b - - Example Implementation + + Example ISA 3.0 LE + Implementation - + + Example ISA 3.0 BE + Implementation + + Restrictions @@ -24690,10 +24803,17 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> signed long long - signed char * + signed char * - sample implementation TBD + + lxvb16x r,a,b + + + + + lxv r,a,b + @@ -24710,7 +24830,14 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> unsigned char * - sample implementation TBD + + lxvb16x r,a,b + + + + + lxv r,a,b + @@ -24727,7 +24854,14 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> signed int * - sample implementation TBD + + lxvw4x r,a,b + + + + + lxv r,a,b + @@ -24744,7 +24878,14 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> unsigned int * - sample implementation TBD + + lxvw4x r,a,b + + + + + lxv r,a,b + @@ -24761,7 +24902,14 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> signed __int128 * - sample implementation TBD + + lxv r,a,b + + + + + lxv r,a,b + @@ -24778,7 +24926,14 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> unsigned __int128 * - sample implementation TBD + + lxv r,a,b + + + + + lxv r,a,b + @@ -24795,7 +24950,14 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> signed long long * - sample implementation TBD + + lxvd2x r,a,b + + + + + lxv r,a,b + @@ -24812,7 +24974,14 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> unsigned long long * - sample implementation TBD + + lxvd2x r,a,b + + + + + lxv r,a,b + @@ -24829,7 +24998,14 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> signed short * - sample implementation TBD + + lxvh8x r,a,b + + + + + lxv r,a,b + @@ -24846,7 +25022,14 @@ xmlns:xlink="http://www.w3.org/1999/xlink" 
xml:id="section_vec_intrinsics"> unsigned short * - sample implementation TBD + + lxvh8x r,a,b + + + + + lxv r,a,b + @@ -24863,7 +25046,14 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> double * - sample implementation TBD + + lxvd2x r,a,b + + + + + lxv r,a,b + @@ -24880,7 +25070,14 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> float * - sample implementation TBD + + lxvw4x r,a,b + + + + + lxv r,a,b + @@ -24897,10 +25094,17 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> _Float16 * - sample implementation TBD + + lxvh8x r,a,b + + + + + lxv r,a,b + - ISA 3.0 or later + Deferred @@ -25520,70 +25724,82 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> vec_xst - Vector ... Spelled Out Name TBD + VSX Unaligned Store - r = vec_xst (ARG1, ARG2, ARG3) + vec_xst (a, b, c) Purpose: - + Stores a 16-byte value into memory at the address specified by the + displacement and pointer. - Result value: Stores the provided vector in memory. - - For languages that support built-in - methods for pointer dereferencing, such as the C/C++ pointer - dereference * and array access [ ] operators, use of the - native operators is encouraged and use of the vec_xl - intrinsic is discouraged. - + Operation: The values of + b and c are added, and the value of a is stored to the resultant address. Endian considerations: - None. + For ISA 2.07, there is no bi-endian unaligned store instruction. For + little-endian targets, it is necessary to first swap the doublewords + of the value to be stored using an xxswapd instruction, and then store + the result using the stxvd2x instruction. For big-endian targets, the + stxvd2x or stxvw4x instruction suffices. The examples below assume ISA + 3.0, where the bi-endian stxv instruction is available. 
+ Notes: + + + + For languages that support built-in methods for pointer + dereferencing, such as the C/C++ * and [ ] operators, use of the + native operators is encouraged when the memory to be accessed is + aligned on a 32-bit boundary or aligned to the type of b, whichever is weaker. + + + + + No Power compilers yet support the vector _Float16 type, so that + interface is currently deferred. + + +
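The operation described for vec_xst can be modelled in a few lines of portable C. This is a sketch, not the intrinsic: model_vec_xst is a hypothetical name, the vector is represented as an array of four 32-bit elements, and the displacement b is treated as a byte offset added to the pointer c, which is an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model of vec_xst: add byte displacement b to pointer c, then store
 * the 16-byte value a at that address. memcpy is used so the target
 * address need not be aligned. */
static void model_vec_xst(const int32_t a[4], long long b, int32_t *c)
{
    assert(a && c);
    memcpy((unsigned char *)c + b, a, 4 * sizeof(int32_t));
}
```

Calling model_vec_xst(v, 2, mem) writes the 16 bytes of v starting two bytes past mem, an address that a plain int32_t store could not legally target through a dereference.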
Supported type signatures for vec_xst - + - - - - r - - - + - ARG1 + a - + - ARG2 + b - + - ARG3 + c - - Example Implementation + + Example ISA 3.0 + Implementation - + Restrictions - - void - vector signed char @@ -25594,16 +25810,15 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> signed char * - sample implementation TBD + + stxv a,b(c) + - - void - vector unsigned char @@ -25614,16 +25829,15 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> unsigned char * - sample implementation TBD + + stxv a,b(c) + - - void - vector signed int @@ -25634,16 +25848,15 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> signed int * - sample implementation TBD + + stxv a,b(c) + - - void - vector unsigned int @@ -25654,16 +25867,15 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> unsigned int * - sample implementation TBD + + stxv a,b(c) + - - void - vector signed __int128 @@ -25674,16 +25886,15 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> signed __int128 * - sample implementation TBD + + stxv a,b(c) + - - void - vector unsigned __int128 @@ -25694,16 +25905,15 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> unsigned __int128 * - sample implementation TBD + + stxv a,b(c) + - - void - vector signed long long @@ -25714,16 +25924,15 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> signed long long * - sample implementation TBD + + stxv a,b(c) + - - void - vector unsigned long long @@ -25734,16 +25943,15 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> unsigned long long * - sample implementation TBD + + stxv a,b(c) + - - void - vector signed short @@ -25754,16 +25962,15 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> signed short * - sample implementation TBD + + stxv a,b(c) + - - void - vector unsigned short @@ 
-25774,16 +25981,15 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> unsigned short * - sample implementation TBD + + stxv a,b(c) + - - void - vector double @@ -25794,16 +26000,15 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> double * - sample implementation TBD + + stxv a,b(c) + - - void - vector float @@ -25814,16 +26019,15 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> float * - sample implementation TBD + + stxv a,b(c) + - - void - vector _Float16 @@ -25834,7 +26038,9 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_vec_intrinsics"> _Float16 * - sample implementation TBD + + stxv a,b(c) + ISA 3.0 or later