From 74b9b120f26e0fbfcb8ca30c285eb3ed6cfc0060 Mon Sep 17 00:00:00 2001 From: "Paul A. Clarke" Date: Wed, 13 May 2020 19:22:03 -0500 Subject: [PATCH 1/3] Consistently specify type of input in examples Change occurrences of "An example follows" to "An example for input _i_ of type _t_ follows", in cases where an intrinsic has more than one possible input type. Signed-off-by: Paul A. Clarke --- Intrinsics_Reference/ch_vec_reference.xml | 40 ++++++++++++++--------- 1 file changed, 24 insertions(+), 16 deletions(-) diff --git a/Intrinsics_Reference/ch_vec_reference.xml b/Intrinsics_Reference/ch_vec_reference.xml index eba5708..df3e9c4 100644 --- a/Intrinsics_Reference/ch_vec_reference.xml +++ b/Intrinsics_Reference/ch_vec_reference.xml @@ -10891,7 +10891,8 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> set to the number of leading zeros of the corresponding element of a. - An example follows: + An example for input a + of type vector unsigned char follows: @@ -11285,7 +11286,8 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> element) of a that have a least-significant bit of zero. - An example follows: + An example for input a + of type vector unsigned char follows: @@ -11802,7 +11804,8 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> element) of a that have a least-significant bit of zero. - An example follows: + An example for input a + of type vector unsigned char follows: @@ -12789,8 +12792,8 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> the converted values of elements 0 and 2 of a. - An example where a - is of type vector signed int follows: + An example for input a + of type vector signed int follows: @@ -13016,8 +13019,8 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> the converted values of elements 0 and 1 of a. 
- An example where a - is of type vector signed int follows: + An example for input a + of type vector signed int follows: @@ -13249,8 +13252,8 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> the converted values of elements 2 and 3 of a. - An example where a - is of type vector signed int follows: + An example for input a + of type vector signed int follows: @@ -13482,8 +13485,8 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> the converted values of elements 1 and 3 of a. - An example where a - is of type vector signed int follows: + An example for input a + of type vector signed int follows: @@ -15315,7 +15318,8 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> element index of the position of the first character match in natural element order. If no match, returns the number of characters as an element count in the vector argument. - An example follows: + An example for input a + of type vector unsigned char follows: @@ -15786,7 +15790,8 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> either the first character match or an end-of-string (EOS) terminator. If no match or terminator, returns the number of characters as an element count in the vector argument. - An example follows: + An example for input a + of type vector unsigned char follows: @@ -16324,7 +16329,8 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> element index of the position of the first character mismatch in natural element order. If no mismatch, returns the number of characters as an element count in the vector argument. - An example follows: + An example for input a + of type vector unsigned char follows: @@ -16779,7 +16785,8 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> either the first character mismatch or an end-of-string (EOS) terminator. If no mismatch or terminator, returns the number of characters as an element count in the vector argument. 
- An example follows: + An example for input a + of type vector unsigned char follows: @@ -27329,7 +27336,8 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> is set to the exclusive-OR of byte elements x of a and y of b. - An example follows: + An example for input a + of type vector unsigned char follows: -- 2.34.1 From 187e9055f73d4b8e35d06340a5458373561903de Mon Sep 17 00:00:00 2001 From: "Paul A. Clarke" Date: Wed, 13 May 2020 21:50:50 -0500 Subject: [PATCH 2/3] Add examples for vec_unpack[hl] Fixes #28. Signed-off-by: Paul A. Clarke --- Intrinsics_Reference/ch_vec_reference.xml | 1164 +++++++++++++++++---- 1 file changed, 949 insertions(+), 215 deletions(-) diff --git a/Intrinsics_Reference/ch_vec_reference.xml b/Intrinsics_Reference/ch_vec_reference.xml index df3e9c4..c049ac8 100644 --- a/Intrinsics_Reference/ch_vec_reference.xml +++ b/Intrinsics_Reference/ch_vec_reference.xml @@ -39299,11 +39299,159 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> of r is the value of the corresponding element of the most-significant half of a. + An example for input a + of type vector signed int follows: + + + + + + + + + + + + + doubleword index + + + 0 + + + 1 + + + + + word index + + + 0 + + + 1 + + + 2 + + + 3 + + + + + + + r + + + 10111213 + + + 24252627 + + + ???????? + + + ???????? + + + + + a + + + 0000000010111213 + + + 0000000024252627 + + + + + + If a is a floating-point vector, the value of each element of r is the value of the corresponding element of the most-significant half of a, widened to the result precision. + An example for input a + of type vector float follows: + + + + + + + + + + + + + doubleword index + + + 0 + + + 1 + + + + + word index + + + 0 + + + 1 + + + 2 + + + 3 + + + + + + + r + + + -2.71828182 + + + 3.14159265 + + + ???????? + + + ???????? 
+ + + + + a + + + -2.71828182 + + + 3.14159265 + + + + + + If a is a pixel vector, the value of each element of r is taken from the corresponding element of the most-significant half of a. - Endian considerations: - The "high" half of a vector with n elements is the - first n/2 elements of the vector. For little - endian, these elements are in the rightmost half of the vector. For - big endian, these elements are in the leftmost half of the vector. - - - - vupklsh - vec_unpackh - - - vupkhsh - vec_unpackh - - - vupklpx - vec_unpackh - - - vupkhpx - vec_unpackh - - - vupklsw - vec_unpackh - - - vupkhsw - vec_unpackh - - - vupklsb - vec_unpackh - - - vupkhsb - vec_unpackh - - - xxsldwi - vec_unpackh - - - xvcvspdp - vec_unpackh - - - - Supported type signatures for vec_unpackh - - - - - + An example follows: + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + word index + + + 0 + + + 1 + + + 2 + + + 3 + + + + + halfword index + + + 0 + + + 1 + + + 2 + + + 3 + + + 4 + + + 5 + + + 6 + + + 7 + + + + + + + a + + + 1234 + + + 2567 + + + 489A + + + 8BCD + + + ???? + + + ???? + + + ???? + + + ???? + + + + + unpack halfwords to words + + + 1234 + + + 2567 + + + 489A + + + 8BCD + + + + + as bits + + + 0001001000110100 + + + 0010010101100111 + + + 0100100010011010 + + + 1000101111001101 + + + + + as 1-5-5-5 + + + 0 + + + 00100 + + + 10001 + + + 10100 + + + 0 + + + 01001 + + + 01011 + + + 00111 + + + 0 + + + 10010 + + + 00100 + + + 11010 + + + 1 + + + 00010 + + + 11110 + + + 01101 + + + + + r + + + 00041114 + + + 00090B07 + + + 0012041A + + + FF021E0D + + + + + + + + Endian considerations: + The "high" half of a vector with n elements is the + first n/2 elements of the vector. For little + endian, these elements are in the rightmost half of the vector. For + big endian, these elements are in the leftmost half of the vector. 
+ + + + vupklsh + vec_unpackh + + + vupkhsh + vec_unpackh + + + vupklpx + vec_unpackh + + + vupkhpx + vec_unpackh + + + vupklsw + vec_unpackh + + + vupkhsw + vec_unpackh + + + vupklsb + vec_unpackh + + + vupkhsb + vec_unpackh + + + xxsldwi + vec_unpackh + + + xvcvspdp + vec_unpackh + + +
+ Supported type signatures for vec_unpackh + + + + + + + + + + r + + + + + a + + + + Example LE + Implementation + + + Example BE + Implementation + + + + + + + vector bool short + + + vector bool char + + + + vupklsb r,a + + + + + vupkhsb r,a + + + + + + vector signed short + + + vector signed char + + + + vupklsb r,a + + + + + vupkhsb r,a + + + + + + vector bool int + + + vector bool short + + + + vupklsh r,a + + + + + vupkhsh r,a + + + + + + vector signed int + + + vector signed short + + + + vupklsh r,a + + + + + vupkhsh r,a + + + + + + vector unsigned int + + + vector pixel + + + + vupklpx r,a + + + + + vupkhpx r,a + + + + + + vector bool long long + + + vector bool int + + + + vupklsw r,a + + + + + vupkhsw r,a + + + + + + vector signed long long + + + vector signed int + + + + vupklsw r,a + + + + + vupkhsw r,a + + + + + + vector double + + + vector float + + + + xxsldwi t,a,a,3 + xxsldwi u,a,t,2 + xvcvspdp r,u + + + + + xxsldwi t,a,a,1 + xxsldwi u,t,a,3 + xvcvspdp r,u + + + + + +
+ + + + + + vec_unpackl + Vector Unpack Low + + r = vec_unpackl (a) + + + Purpose: + Unpacks the least-significant (“low”) half of a vector into a vector + with larger elements. + + Result value: If a is an integer vector, the value of each element + of r is the value of the corresponding + element of the least-significant half of a. + An example for input a + of type vector signed int follows: + + + + + + + + + + + + + doubleword index + + + 0 + + + 1 + + + + + word index + + + 0 + + + 1 + + + 2 + + + 3 + + + + + + + r + + + ???????? + + + ???????? + + + 38393A3B + + + 4C4D4E4F + + + + + a + + + 0000000038393A3B + + + 000000004C4D4E4F + + + + + + + If a is a floating-point vector, + the value of each element of r is the + value of the corresponding element of the least-significant half of + a, widened to the result + precision. + An example for input a + of type vector float follows: + + + + + + + + + + + + + doubleword index + + + 0 + + + 1 + + + + + word index + + + 0 + + + 1 + + + 2 + + + 3 + + + + + + + r + + + ???????? + + + ???????? + + + 6.0221409e+23 + + + 1.61803398875 + + + + + a + + + 6.0221409e+23 + + + 1.61803398875 + + + + + + + If a is a pixel vector, the value + of each element of r is taken from the + corresponding element of the least-significant half of a as follows: + + + All bits in the first byte of the element of r are set to the value of the first bit of + the element of a. + + + The least-significant 5 bits of the second byte of the + element of r are set to the value + of the next 5 bits in the element of a. + + + The least-significant 5 bits of the third byte of the + element of r are set to the value + of the next 5 bits in the element of a. + + + The least-significant 5 bits of the fourth byte of the + element of r are set to the value + of the next 5 bits in the element of a. 
+ + + An example follows: + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + - - r - + word index + + 0 + + + 1 + + + 2 + + + 3 + + + - - a - + halfword index - - Example LE - Implementation + + 0 - - Example BE - Implementation + + 1 + + + 2 + + + 3 + + + 4 + + + 5 + + + 6 + + + 7 - vector bool short + a - - vector bool char + + ???? - - - vupklsb r,a - + + ???? - - - vupkhsb r,a - + + ???? + + + ???? + + + DCB8 + + + A984 + + + 7652 + + + 4321 - vector signed short + unpack halfwords to words - - vector signed char + + DCB8 - - - vupklsb r,a - + + A984 - - - vupkhsb r,a - + + 7652 + + + 4321 - vector bool int + as bits - - vector bool short + + 1101110010111000 - - - vupklsh r,a - + + 1010100110000100 - - - vupkhsh r,a - + + 0111011001010010 + + + 0100001100100001 - vector signed int + as 1-5-5-5 - vector signed short + 1 - - - vupklsh r,a - + + 10111 - - - vupkhsh r,a - + + 00101 - - - vector unsigned int + 11000 - vector pixel + 1 - - - vupklpx r,a - + + 01010 - - - vupkhpx r,a - + + 01100 - - - vector bool long long + 00100 - vector bool int + 0 - - - vupklsw r,a - + + 11101 - - - vupkhsw r,a - + + 10010 - - - vector signed long long + 10010 - vector signed int + 0 - - - vupklsw r,a - + + 10000 - - - vupkhsw r,a - + + 11001 + + + 00001 - vector double + r - - vector float + + FF170518 - - - xxsldwi t,a,a,3 - xxsldwi u,a,t,2 - xvcvspdp r,u - + + FF0A0C04 - - - xxsldwi t,a,a,1 - xxsldwi u,t,a,3 - xvcvspdp r,u - + + 001D1212 + + + 00101901 - - - - - - - vec_unpackl - Vector Unpack Low - - r = vec_unpackl (a) - - - Purpose: - Unpacks the least-significant (“low”) half of a vector into a vector - with larger elements. + - Result value: If a is an integer vector, the value of each element - of r is the value of the corresponding - element of the least-significant half of a. - If a is a floating-point vector, - the value of each element of r is the - value of the corresponding element of the least-significant half of - a, widened to the result - precision. 
- If a is a pixel vector, the value - of each element of r is taken from the - corresponding element of the least-significant half of a as follows: - - - All bits in the first byte of the element of r are set to the value of the first bit of - the element of a. - - - The least-significant 5 bits of the second byte of the - element of r are set to the value - of the next 5 bits in the element of a. - - - The least-significant 5 bits of the third byte of the - element of r are set to the value - of the next 5 bits in the element of a. - - - The least-significant 5 bits of the fourth byte of the - element of r are set to the value - of the next 5 bits in the element of a. - - + Endian considerations: The "high" half of a vector with n elements is the first n/2 elements of the vector. For little -- 2.34.1 From 2d3eb1c0ea0a422171f06cac3bd1e7e10594763a Mon Sep 17 00:00:00 2001 From: "Paul A. Clarke" Date: Thu, 14 May 2020 15:12:06 -0500 Subject: [PATCH 3/3] Change references to intrinsics into links Signed-off-by: Paul A. Clarke --- Intrinsics_Reference/ch_biendian.xml | 178 +++++++++++++++------------ 1 file changed, 96 insertions(+), 82 deletions(-) diff --git a/Intrinsics_Reference/ch_biendian.xml b/Intrinsics_Reference/ch_biendian.xml index dd368fc..a8557cf 100644 --- a/Intrinsics_Reference/ch_biendian.xml +++ b/Intrinsics_Reference/ch_biendian.xml @@ -180,8 +180,9 @@ vector unsigned __int128 x = { (((unsigned __int128)0x1020304050607080) << to access the Nth vector element from a vector pointer. The dereference operator * may not be used to access data that is not - aligned at least to a quadword boundary. Built-in functions - such as vec_xl and vec_xst are + aligned at least to a quadword boundary. Built-in functions such as + and + and provided for unaligned data access. Please refer to for an example. 
@@ -796,238 +797,238 @@ a[3] = c; - vec_bperm + - vec_mergeh + - vec_signedo + - vec_cipher_be + - vec_mergel + - vec_sld + - vec_cipherlast_be + - vec_mergeo + - vec_sldw + - vec_doublee + - vec_mfvscr + - vec_sll + - vec_doubleh + - vec_mule + - vec_slo + - vec_doublel + - vec_mulo + - vec_slv + - vec_doubleo + - vec_ncipher_be + - vec_splat + - vec_extract + - vec_ncipherlast_be + - vec_srl + - vec_extract_fp32_from_shorth + - vec_pack + - vec_sro + - vec_extract_fp32_from_shortl + - vec_pack_to_short_fp32 + - vec_srv + - vec_extract4b + - vec_packpx + - vec_sum2s + - vec_first_match_index + - vec_packs + - vec_sums + - vec_first_match_or_eos_index + - vec_packsu + - vec_unpackh + - vec_first_mismatch_index + - vec_perm + - vec_unpackl + - vec_first_mismatch_or_eos_index + - vec_permxor + - vec_unsigned2 + - vec_float2 + - vec_pmsum_be + - vec_unsignede + - vec_floate + - vec_reve + - vec_unsignedo + - vec_floato + - vec_sbox_be + - vec_xl (ISA 2.07 only) + (ISA 2.07 only) - vec_gb + - vec_shasigma_be + - vec_xl_be + - vec_insert + - vec_signed2 + - vec_xst (ISA 2.07 only) + (ISA 2.07 only) - vec_insert4b + - vec_signede + - vec_xst_be + - vec_mergee + @@ -1056,7 +1057,8 @@ a[3] = c; Before the bi-endian programming model was introduced, the vec_lvsl and vec_lvsr intrinsics were supported. These could be used in conjunction with - vec_perm and VMX load and store instructions for + + and VMX load and store instructions for unaligned access. The vec_lvsl and vec_lvsr interfaces are deprecated in accordance with the interfaces specified here. 
For compatibility, the @@ -1097,7 +1099,7 @@ a[3] = c; - vec_ld + lvx @@ -1108,7 +1110,7 @@ a[3] = c; - vec_lde + lvebx, lvehx, lvewx @@ -1119,7 +1121,7 @@ a[3] = c; - vec_ldl + lvxl @@ -1130,7 +1132,7 @@ a[3] = c; - vec_st + stvx @@ -1141,7 +1143,7 @@ a[3] = c; - vec_ste + stvebx, stvehx, stvewx @@ -1152,7 +1154,7 @@ a[3] = c; - vec_stl + stvxl @@ -1166,7 +1168,9 @@ a[3] = c; Instead, it is recommended that programmers use the - vec_xl and vec_xst vector built-in + and + + vector built-in functions to access unaligned data streams. See the descriptions of these instructions in for further description and @@ -1479,35 +1483,44 @@ a[3] = c; vec_sld and vec_sro are not bi-endian One oddity in the bi-endian vector programming model is that - vec_sld has big-endian semantics for code + + has big-endian semantics for code compiled for both big-endian and little-endian targets. That - is, any code that uses vec_sld without guarding + is, any code that uses + + without guarding it with a test on endianness is likely to be incorrect. At the time that the bi-endian model was being developed, it was discovered that existing code in several Linux packages - was using vec_sld in order to perform multiplies, + was using + + in order to perform multiplies, or to otherwise shift portions of base elements left. A straightforward little-endian implementation of - vec_sld would concatenate the two input vectors + + would concatenate the two input vectors in reverse order and shift bytes to the right. This would only give compatible results for vector char types. Those using this intrinsic as a cheap multiply, or to shift bytes within larger elements, would see different results on little-endian versus big-endian with such an implementation. Therefore it was decided that - vec_sld would not have a bi-endian + + would not have a bi-endian implementation. - vec_sro is not bi-endian for similar reasons. + + is not bi-endian for similar reasons.
Limitations on bi-endianness of vec_perm - The vec_perm intrinsic is bi-endian, provided + The + intrinsic is bi-endian, provided that it is used to reorder entire elements of the input vectors. @@ -2533,7 +2546,8 @@ a[3] = c; - The lesson here is to only use vec_perm to + The lesson here is to only use + to reorder entire elements of a vector. If you must use vec_perm for another purpose, your code must include a test for endianness and separate algorithms for big- and -- 2.34.1
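Reviewer note on the pixel tables added in patch 2/3: the per-element 1-5-5-5 expansion described for vec_unpackh and vec_unpackl can be sanity-checked with a small scalar sketch in plain C. The names unpack_pixel_1555 and check_pixel_tables are hypothetical and are not part of this series or of any library. The sketch assumes the "first bit" of a pixel element is its most-significant bit, which is what the "as 1-5-5-5" rows suggest. It models only the value of each result element, not which half of the vector is selected or how elements are numbered under either endianness.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar helper, not part of any library or of this patch:
 * expand one 16-bit 1-5-5-5 pixel into a 32-bit word. */
static uint32_t unpack_pixel_1555(uint16_t p)
{
    /* Byte 0: every bit takes the value of the pixel's first bit
     * (assumed here to be the most-significant bit). */
    uint32_t b0 = (p & 0x8000u) ? 0xFFu : 0x00u;
    /* Bytes 1..3: the low 5 bits of each byte hold the next three
     * 5-bit fields of the pixel, in order. */
    uint32_t b1 = (p >> 10) & 0x1Fu;
    uint32_t b2 = (p >> 5) & 0x1Fu;
    uint32_t b3 = p & 0x1Fu;
    return (b0 << 24) | (b1 << 16) | (b2 << 8) | b3;
}

/* Hypothetical self-check: the a and r rows of the two pixel tables
 * (vec_unpackh values first, then vec_unpackl values). */
static void check_pixel_tables(void)
{
    static const uint16_t a[8] = {
        0x1234, 0x2567, 0x489A, 0x8BCD,
        0xDCB8, 0xA984, 0x7652, 0x4321
    };
    static const uint32_t r[8] = {
        0x00041114u, 0x00090B07u, 0x0012041Au, 0xFF021E0Du,
        0xFF170518u, 0xFF0A0C04u, 0x001D1212u, 0x00101901u
    };
    for (int i = 0; i < 8; i++)
        assert(unpack_pixel_1555(a[i]) == r[i]);
}
```

Calling check_pixel_tables() from a test harness exercises all eight a/r pairs from the two tables. A failed assertion would point at a table entry that disagrees with the four byte rules listed in the text.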