|
|
<!--
|
|
|
Copyright (c) 2019 OpenPOWER Foundation
|
|
|
|
|
|
Licensed under the Apache License, Version 2.0 (the "License");
|
|
|
you may not use this file except in compliance with the License.
|
|
|
You may obtain a copy of the License at
|
|
|
|
|
|
http://www.apache.org/licenses/LICENSE-2.0
|
|
|
|
|
|
Unless required by applicable law or agreed to in writing, software
|
|
|
distributed under the License is distributed on an "AS IS" BASIS,
|
|
|
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
|
|
See the License for the specific language governing permissions and
|
|
|
limitations under the License.
|
|
|
|
|
|
-->
|
|
|
<chapter version="5.0" xml:lang="en" xmlns="http://docbook.org/ns/docbook" xmlns:xi="http://www.w3.org/2001/XInclude"
|
|
|
xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.biendian">
|
|
|
|
|
|
<!-- Chapter Title goes here. -->
|
|
|
<title>The POWER Bi-Endian Vector Programming Model</title>
|
|
|
|
|
|
<para>
|
|
|
To ensure portability of applications optimized to exploit the
|
|
|
SIMD functions of POWER ISA processors, the ELF V2 ABI defines a
|
|
|
set of functions and data types for SIMD programming. ELF
|
|
|
V2-compliant compilers will provide suitable support for these
|
|
|
functions, preferably as built-in functions that translate to one
|
|
|
or more POWER ISA instructions.
|
|
|
</para>
|
|
|
<para>
|
|
|
Compilers are encouraged, but not required, to provide built-in
|
|
|
functions to access individual instructions in the IBM POWER®
|
|
|
instruction set architecture. In most cases, each such built-in
|
|
|
function should provide direct access to the underlying
|
|
|
instruction.
|
|
|
</para>
|
|
|
<para>
|
|
|
However, to ease porting between little-endian (LE) and big-endian
|
|
|
(BE) POWER systems, and between POWER and other platforms, it is
|
|
|
preferable that some built-in functions provide the same semantics
|
|
|
on both LE and BE POWER systems, even if this means that the
|
|
|
built-in functions are implemented with different instruction
|
|
|
sequences for LE and BE. To achieve this, the vector built-in
functions are derived from the operations provided by the POWER
vector SIMD instructions rather than mapped one-to-one onto those
instructions. Unlike traditional “hardware intrinsic” built-in
|
|
|
functions, no fixed mapping exists between these built-in
|
|
|
functions and the generated hardware instruction sequence. Rather,
|
|
|
the compiler is free to generate optimized instruction sequences
|
|
|
that implement the semantics of the program specified by the
|
|
|
programmer using these built-in functions.
|
|
|
</para>
|
|
|
<para>
|
|
|
This is primarily applicable to the POWER SIMD instructions. As
|
|
|
we've seen, this set of instructions operates on groups of 2, 4,
|
|
|
8, or 16 vector elements at a time in 128-bit registers. On a
|
|
|
big-endian POWER platform, vector elements are loaded from memory
|
|
|
into a register so that the 0th element occupies the high-order
|
|
|
bits of the register, and the (N – 1)th element occupies the
|
|
|
low-order bits of the register. This is referred to as big-endian
|
|
|
element order. On a little-endian POWER platform, vector elements
|
|
|
are loaded from memory such that the 0th element occupies the
|
|
|
low-order bits of the register, and the (N – 1)th element
|
|
|
occupies the high-order bits. This is referred to as little-endian
|
|
|
element order.
|
|
|
</para>
|
|
|
|
|
|
<note>
|
|
|
<para>
|
|
|
Much of the information in this chapter was formerly part of
|
|
|
Chapter 6 of the 64-Bit ELF V2 ABI Specification for POWER.
|
|
|
</para>
|
|
|
</note>
|
|
|
|
|
|
<section>
|
|
|
<title>Vector Data Types</title>
|
|
|
<para>
|
|
|
Languages provide support for the data types in <xref
|
|
|
linkend="VIPR.biendian.vectypes" /> to represent vector data
|
|
|
types stored in vector registers.
|
|
|
</para>
|
|
|
<para>
|
|
|
For the C and C++ programming languages (and related/derived
languages), these data types may be accessed using the type
names listed in <xref linkend="VIPR.biendian.vectypes" /> when
Power ISA SIMD language extensions are enabled, using either the
<code>vector</code> or <code>__vector</code> keyword.
<!-- FIXME: We haven't talked about these at all. Need to borrow some
description from the AltiVec PIM about the usage of vector,
bool, and pixel, and supplement with the problems this causes
with strict-ANSI C++. Maybe a separate section on "Language
Elements" should precede this one. -->
|
|
|
</para>
|
|
|
<para>
|
|
|
For the Fortran language, <xref
|
|
|
linkend="VIPR.biendian.fortran-types" /> gives a correspondence
|
|
|
between Fortran and C/C++ language types.
|
|
|
</para>
|
|
|
<para>
|
|
|
The assignment operator always performs a byte-by-byte data copy
|
|
|
for vector data types.
|
|
|
</para>
|
|
|
<para>
|
|
|
Like other C/C++ language types, vector types may be defined to
|
|
|
have const or volatile properties. Vector data types can be
|
|
|
defined as being in static, auto, and register storage.
|
|
|
</para>
|
|
|
<para>
|
|
|
Pointers to vector types are defined like pointers to other
C/C++ types. Pointers to vector objects may be defined to have
const and volatile properties. The address held in a pointer to
a vector object must be a multiple of 16, because vector objects
are always aligned on quadword (128-bit) boundaries.
|
|
|
</para>
|
|
|
<para>
|
|
|
The preferred way to access vectors at an application-defined
address is by using vector pointers and the C/C++ dereference
operator <code>*</code>. As with other C/C++ data types, the
array reference operator <code>[]</code> may be applied to a
vector pointer, with the usual meaning, to access the
<emphasis>n</emphasis>th vector object starting at that
pointer. The dereference operator <code>*</code> may
<emphasis>not</emphasis> be used to access data that is not
aligned at least to a quadword boundary. Built-in functions
such as <code>vec_xl</code> and <code>vec_xst</code> are
provided for unaligned data access.
|
|
|
</para>
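<para>
As a hedged illustration of the distinction above, the following
minimal sketch loads four floats from an address that is generally
not quadword aligned. It assumes a GCC- or Clang-compatible
compiler. Where <code>__VSX__</code> is defined it uses
<code>vec_xl</code> from <code>altivec.h</code>; elsewhere it falls
back to <code>memcpy</code> into a generic 16-byte vector so that
the sketch still compiles on non-POWER hosts. The names
<code>f32x4</code>, <code>load4_unaligned</code>, and
<code>sum4_from_offset1</code> are hypothetical and are not part of
any API.
</para>
<programlisting><![CDATA[
```c
#include <string.h>

#if defined(__VSX__)
#include <altivec.h>
typedef __vector float f32x4;   /* POWER vector type */
#else
/* Assumption: stand-in for non-POWER hosts (GCC/Clang generic vector). */
typedef float f32x4 __attribute__((vector_size(16)));
#endif

/* Load four floats from an address that need not be 16-byte aligned. */
static f32x4 load4_unaligned(const float *p)
{
#if defined(__VSX__)
    return vec_xl(0, p);        /* built-in intended for unaligned access */
#else
    f32x4 v;
    memcpy(&v, p, sizeof v);    /* portable unaligned copy */
    return v;
#endif
}

/* Sum buf[1..4]; (buf + 1) is not quadword aligned when buf is, so
   dereferencing it as a vector pointer would not be permitted. */
static float sum4_from_offset1(const float *buf)
{
    f32x4 v = load4_unaligned(buf + 1);
    return v[0] + v[1] + v[2] + v[3];
}
```
]]></programlisting>
<para>
Casting <code>buf + 1</code> to a vector pointer and applying
<code>*</code> would violate the alignment rule stated above, which
is why the sketch goes through a load built-in instead.
</para>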
|
|
|
<para>
|
|
|
Compilers are expected to recognize sequences of operations that
can be combined into a single hardware instruction. For example,
a load-and-splat hardware instruction might be generated for the
following sequence:
|
|
|
</para>
|
|
|
<programlisting>double *double_ptr;
register vector double vd = vec_splats(*double_ptr);</programlisting>
|
|
|
<table frame="all" pgwide="1" xml:id="VIPR.biendian.vectypes">
|
|
|
<title>Vector Types</title>
|
|
|
<tgroup cols="4">
|
|
|
<colspec colname="c1" colwidth="20*" />
|
|
|
<colspec colname="c2" colwidth="10*" align="center" />
|
|
|
<colspec colname="c3" colwidth="15*" align="center" />
|
|
|
<colspec colname="c4" colwidth="40*" />
|
|
|
<thead>
|
|
|
<row>
|
|
|
<entry align="center">
|
|
|
<para>
|
|
|
<emphasis role="bold">Power SIMD C Types</emphasis>
|
|
|
</para>
|
|
|
</entry>
|
|
|
<entry align="center">
|
|
|
<para>
|
|
|
<emphasis role="bold">sizeof</emphasis>
|
|
|
</para>
|
|
|
</entry>
|
|
|
<entry align="center">
|
|
|
<para>
|
|
|
<emphasis role="bold">Alignment</emphasis>
|
|
|
</para>
|
|
|
</entry>
|
|
|
<entry align="center">
|
|
|
<para>
|
|
|
<emphasis role="bold">Description</emphasis>
|
|
|
</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
</thead>
|
|
|
<tbody>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>vector unsigned char</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>16</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Quadword</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Vector of 16 unsigned bytes.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>vector signed char</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>16</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Quadword</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Vector of 16 signed bytes.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>vector bool char</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>16</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Quadword</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Vector of 16 bytes with a value of either 0 or
|
|
|
2<superscript>8</superscript> – 1.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>vector unsigned short</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>16</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Quadword</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Vector of 8 unsigned halfwords.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>vector signed short</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>16</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Quadword</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Vector of 8 signed halfwords.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>vector bool short</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>16</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Quadword</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Vector of 8 halfwords with a value of either 0 or
|
|
|
2<superscript>16</superscript> – 1.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>vector unsigned int</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>16</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Quadword</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Vector of 4 unsigned words.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>vector signed int</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>16</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Quadword</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Vector of 4 signed words.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>vector bool int</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>16</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Quadword</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Vector of 4 words with a value of either 0 or
|
|
|
2<superscript>32</superscript> – 1.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>vector unsigned long<footnote xml:id="vlong">
|
|
|
<para>The vector long types are deprecated due to their
|
|
|
ambiguity between 32-bit and 64-bit environments. The use
|
|
|
of the vector long long types is preferred.</para>
|
|
|
</footnote></para>
|
|
|
<para>vector unsigned long long</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>16</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Quadword</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Vector of 2 unsigned doublewords.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>vector signed long<footnoteref linkend="vlong" /></para>
|
|
|
<para>vector signed long long</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>16</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Quadword</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Vector of 2 signed doublewords.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>vector bool long<footnoteref linkend="vlong" /></para>
|
|
|
<para>vector bool long long</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>16</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Quadword</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Vector of 2 doublewords with a value of either 0 or
|
|
|
2<superscript>64</superscript> – 1.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>vector unsigned __int128</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>16</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Quadword</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Vector of 1 unsigned quadword.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>vector signed __int128</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>16</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Quadword</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Vector of 1 signed quadword.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>vector _Float16</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>16</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Quadword</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Vector of 8 half-precision floats.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>vector float</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>16</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Quadword</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Vector of 4 single-precision floats.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>vector double</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>16</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Quadword</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Vector of 2 double-precision floats.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
</tbody>
|
|
|
</tgroup>
|
|
|
</table>
|
|
|
</section>
|
|
|
|
|
|
<section>
|
|
|
<title>Vector Operators</title>
|
|
|
<para>
|
|
|
In addition to the dereference and assignment operators, the
Power SIMD Vector Programming API provides the usual operators
that are valid on pointers; these operators are also valid for
pointers to vector types.
<!-- FIXME: If we're going to use a term like "Power SIMD Vector
Programming API", let's use it consistently; also, SIMD and
Vector are redundant. -->
|
|
|
</para>
|
|
|
<para>
|
|
|
The traditional C/C++ operators are defined on vector types
with “do all” semantics for unary and binary <code>+</code>,
unary and binary <code>-</code>, binary <code>*</code>, binary
<code>%</code>, and binary <code>/</code>, as well as the unary
and binary shift, logical, and comparison operators, and the
ternary <code>?:</code> operator.
|
|
|
</para>
|
|
|
<para>
|
|
|
For unary operators, the specified operation is performed on
|
|
|
the corresponding base element of the single operand to derive
|
|
|
the result value for each vector element of the vector
|
|
|
result. The result type of unary operations is the type of the
|
|
|
single input operand.
|
|
|
</para>
|
|
|
<para>
|
|
|
For binary operators, the specified operation is performed on
|
|
|
the corresponding base elements of both operands to derive the
|
|
|
result value for each vector element of the vector
|
|
|
result. Both operands of the binary operators must have the
|
|
|
same vector type with the same base element type. The result
|
|
|
of binary operators is the same type as the type of the input
|
|
|
operands.
|
|
|
</para>
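<para>
The “do all” semantics described above can be sketched as follows.
To keep the example compilable on non-POWER hosts, it assumes a
GCC- or Clang-compatible compiler and uses the generic
<code>vector_size</code> attribute as a stand-in for
<code>vector signed int</code>; the type name <code>i32x4</code>
and the function name <code>add_then_negate</code> are
hypothetical.
</para>
<programlisting><![CDATA[
```c
/* Assumption: generic 4 x int vector standing in for "vector signed int". */
typedef int i32x4 __attribute__((vector_size(16)));

/* Binary + is applied to each pair of corresponding elements, then
   unary - is applied to each element of the intermediate result.
   Both operands and the result share one vector type. */
static i32x4 add_then_negate(i32x4 a, i32x4 b)
{
    i32x4 sum = a + b;
    return -sum;
}
```
]]></programlisting>
<para>
For inputs {1, 2, 3, 4} and {10, 20, 30, 40}, each result element
depends only on the elements at the same position in the operands.
</para>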
|
|
|
<para>
|
|
|
Further, the array reference operator may be applied to vector
|
|
|
data types, yielding an l-value corresponding to the specified
|
|
|
element in accordance with the vector element numbering rules (see
|
|
|
<xref linkend="VIPR.biendian.layout" />). An l-value may either
|
|
|
be assigned a new value or accessed for reading its value.
|
|
|
</para>
|
|
|
</section>
|
|
|
|
|
|
<section xml:id="VIPR.biendian.layout">
|
|
|
<title>Vector Layout and Element Numbering</title>
|
|
|
<para>
|
|
|
Vector data types consist of a homogeneous sequence of elements
|
|
|
of the base data type specified in the vector data
|
|
|
type. Individual elements of a vector can be addressed by a
|
|
|
vector element number. Element numbers can be established either
|
|
|
by counting from the “left” of a register and assigning the
|
|
|
left-most element the element number 0, or from the “right” of
|
|
|
the register and assigning the right-most element the element
|
|
|
number 0.
|
|
|
</para>
|
|
|
<para>
|
|
|
In big-endian environments, establishing element counts from the
|
|
|
left makes the element stored at the lowest memory address the
|
|
|
lowest-numbered element. Thus, when vectors and arrays of a
|
|
|
given base data type are overlaid, vector element 0 corresponds
|
|
|
to array element 0, vector element 1 corresponds to array
|
|
|
element 1, and so forth.
|
|
|
</para>
|
|
|
<para>
|
|
|
In little-endian environments, establishing element counts from
|
|
|
the right makes the element stored at the lowest memory address
|
|
|
the lowest-numbered element. Thus, when vectors and arrays of a
|
|
|
given base data type are overlaid, vector element 0 will
|
|
|
correspond to array element 0, vector element 1 will correspond
|
|
|
to array element 1, and so forth.
|
|
|
</para>
|
|
|
<para>
|
|
|
Consequently, the vector numbering schemes can be described as
|
|
|
big-endian and little-endian vector layouts and vector element
|
|
|
numberings.
|
|
|
</para>
|
|
|
<para>
|
|
|
This element numbering shall also be used by the <code>[]</code>
|
|
|
accessor method to vector elements provided as an extension of
|
|
|
the C/C++ languages by some compilers, as well as for other
|
|
|
language extensions or library constructs that directly or
|
|
|
indirectly refer to elements by their element number.
|
|
|
</para>
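<para>
The correspondence between vector element numbers and array
indices can be checked with a small sketch. It assumes a GCC- or
Clang-compatible compiler and again uses a generic
<code>vector_size</code> type as a stand-in for
<code>vector signed int</code>, so that it also builds on
non-POWER hosts. All type and function names are hypothetical.
</para>
<programlisting><![CDATA[
```c
#include <string.h>

/* Assumption: generic 4 x int vector standing in for "vector signed int". */
typedef int i32x4 __attribute__((vector_size(16)));

/* Copy four ints into a vector, preserving their order in memory. */
static i32x4 from_array(const int *a)
{
    i32x4 v;
    memcpy(&v, a, sizeof v);
    return v;
}

/* Return 1 if vector element i equals array element i for every i. */
static int elements_match_array(const int *a)
{
    i32x4 v = from_array(a);
    for (int i = 0; i < 4; i++) {
        if (v[i] != a[i])
            return 0;
    }
    return 1;
}

/* Write one element through [] and copy the vector back to memory;
   only the array slot with the same index changes. */
static void set_element_via_vector(int *a, int index, int value)
{
    i32x4 v = from_array(a);
    v[index] = value;
    memcpy(a, &v, sizeof v);
}
```
]]></programlisting>
<para>
Under the numbering rules of this section, the check is expected to
succeed in both big-endian and little-endian environments.
</para>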
|
|
|
<para>
|
|
|
Application programs may query the vector element ordering in
|
|
|
use by testing the __VEC_ELEMENT_REG_ORDER__ macro. This macro
|
|
|
has two possible values:
|
|
|
</para>
|
|
|
<informaltable frame="none" rowsep="0" colsep="0">
|
|
|
<tgroup cols="2">
|
|
|
<colspec colname="c1" colwidth="40*" />
|
|
|
<colspec colname="c2" colwidth="60*" />
|
|
|
<tbody>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>__ORDER_LITTLE_ENDIAN__</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Vector elements use little-endian element ordering.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>__ORDER_BIG_ENDIAN__</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Vector elements use big-endian element ordering.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
</tbody>
|
|
|
</tgroup>
|
|
|
</informaltable>
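<para>
A minimal sketch of such a query follows. It assumes a GCC- or
Clang-compatible preprocessor, which predefines
<code>__ORDER_LITTLE_ENDIAN__</code>,
<code>__ORDER_BIG_ENDIAN__</code>, and
<code>__BYTE_ORDER__</code>. The fallback to
<code>__BYTE_ORDER__</code> when
<code>__VEC_ELEMENT_REG_ORDER__</code> is not defined is an
assumption made only so that the sketch compiles on other targets;
<code>ELEMENT_ORDER</code> and
<code>vector_element_order</code> are hypothetical names.
</para>
<programlisting><![CDATA[
```c
/* Assumption: fall back to the byte order on targets that do not
   predefine __VEC_ELEMENT_REG_ORDER__. */
#if defined(__VEC_ELEMENT_REG_ORDER__)
#define ELEMENT_ORDER __VEC_ELEMENT_REG_ORDER__
#else
#define ELEMENT_ORDER __BYTE_ORDER__
#endif

/* Report the vector element ordering selected at compile time. */
static const char *vector_element_order(void)
{
#if ELEMENT_ORDER == __ORDER_LITTLE_ENDIAN__
    return "little-endian";
#elif ELEMENT_ORDER == __ORDER_BIG_ENDIAN__
    return "big-endian";
#else
    return "unknown";
#endif
}
```
]]></programlisting>
<para>
Because the test is resolved by the preprocessor, code that depends
on element order can be selected at compile time with no run-time
cost.
</para>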
|
|
|
</section>
|
|
|
|
|
|
<section>
|
|
|
<title>Vector Built-In Functions</title>
|
|
|
<para>
|
|
|
Some of the POWER SIMD hardware instructions refer, implicitly
|
|
|
or explicitly, to vector element numbers. For example, the
|
|
|
<code>vspltb</code> instruction has as one of its inputs an
|
|
|
index into a vector. The element at that index position is to
|
|
|
be replicated in every element of the output vector. As
another example, the <code>vmuleuh</code> instruction operates on
the even-numbered elements of its input vectors. The hardware
|
|
|
instructions define these element numbers using big-endian
|
|
|
element order, even when the machine is running in little-endian
|
|
|
mode. Thus, a built-in function that maps directly to the
|
|
|
underlying hardware instruction, regardless of the target
|
|
|
endianness, has the potential to confuse programmers on
|
|
|
little-endian platforms.
|
|
|
</para>
|
|
|
<para>
|
|
|
It is more useful to define built-in functions that map to these
|
|
|
instructions to use natural element order. That is, the
|
|
|
explicit or implicit element numbers specified by such built-in
|
|
|
functions should be interpreted using big-endian element order
|
|
|
on a big-endian platform, and using little-endian element order
|
|
|
on a little-endian platform.
|
|
|
</para>
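<para>
One way to picture the adjustment this implies is an index
translation from natural element order to the big-endian element
order that the hardware instruction uses. The sketch below is only
an illustration under that assumption and is not a description of
how any particular compiler works; the function and parameter
names are hypothetical.
</para>
<programlisting><![CDATA[
```c
/* Translate an element number given in natural element order into the
   big-endian element number used by the hardware instruction.
   On a big-endian target the two numberings coincide; on a
   little-endian target the numbering is reversed. */
static unsigned hw_element_index(unsigned natural_index,
                                 unsigned num_elements,
                                 int target_is_little_endian)
{
    if (target_is_little_endian)
        return (num_elements - 1u) - natural_index;
    return natural_index;
}
```
]]></programlisting>
<para>
For a vector of 16 byte elements on a little-endian target, natural
element 3 corresponds to big-endian element 12 under this mapping,
so a splat of element 3 would need to name element 12 in an
instruction such as <code>vspltb</code>.
</para>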
|
|
|
<para>
|
|
|
The descriptions of the built-in functions in <xref
|
|
|
linkend="VIPR.vec-ref" /> contain notes on endian issues that
|
|
|
apply to each built-in function. Furthermore, a built-in
|
|
|
function requiring a different compiler implementation for
|
|
|
big-endian than it uses for little-endian has a sample
|
|
|
compiler implementation for both BE and LE. These sample
|
|
|
implementations are only intended as examples; designers of a
|
|
|
compiler are free to use other methods to implement the
|
|
|
specified semantics as they see fit.
|
|
|
</para>
|
|
|
<section>
|
|
|
<title>Extended Data Movement Functions</title>
|
|
|
<para>
|
|
|
The built-in functions in <xref
linkend="VIPR.biendian.vmx-mem" /> map to AltiVec/VMX load and
store instructions and provide access to the “auto-aligning”
memory instructions of the VMX ISA, in which low-order address
bits are discarded before the memory access is performed. These
instructions load and store data in accordance with the
program's current endian mode, and do not need to be adapted
by the compiler during code generation to reflect little-endian
operation.
|
|
|
</para>
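<para>
For the quadword forms (<code>lvx</code>, <code>lvxl</code>,
<code>stvx</code>, and <code>stvxl</code>), discarding the
low-order address bits amounts to rounding the address down to a
16-byte boundary. The following sketch models that calculation in
portable C; the function name
<code>vmx_effective_address</code> is hypothetical.
</para>
<programlisting><![CDATA[
```c
#include <stdint.h>

/* Model the address actually accessed by a quadword VMX load or store:
   the low-order four bits of the requested address are cleared. */
static uintptr_t vmx_effective_address(uintptr_t requested)
{
    return requested & ~(uintptr_t)15;
}
```
]]></programlisting>
<para>
A consequence is that a <code>vec_ld</code> from a misaligned
address does not fault and does not return data starting at that
address; it returns the aligned quadword that contains it.
</para>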
|
|
|
<table frame="all" pgwide="1" xml:id="VIPR.biendian.vmx-mem">
|
|
|
<title>VMX Memory Access Built-In Functions</title>
|
|
|
<tgroup cols="3">
|
|
|
<colspec colname="c1" colwidth="15*" align="center" />
|
|
|
<colspec colname="c2" colwidth="35*" align="center" />
|
|
|
<colspec colname="c3" colwidth="50*" />
|
|
|
<thead>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>
|
|
|
<emphasis role="bold">Built-in Function</emphasis>
|
|
|
</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>
|
|
|
<emphasis role="bold">Corresponding POWER
|
|
|
Instructions</emphasis>
|
|
|
</para>
|
|
|
</entry>
|
|
|
<entry align="center">
|
|
|
<para>
|
|
|
<emphasis role="bold">Implementation Notes</emphasis>
|
|
|
</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
</thead>
|
|
|
<tbody>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>vec_ld</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>lvx</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Hardware works as a function of endian mode.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>vec_lde</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>lvebx, lvehx, lvewx</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Hardware works as a function of endian mode.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>vec_ldl</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>lvxl</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Hardware works as a function of endian mode.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>vec_st</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>stvx</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Hardware works as a function of endian mode.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>vec_ste</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>stvebx, stvehx, stvewx</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Hardware works as a function of endian mode.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>vec_stl</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>stvxl</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Hardware works as a function of endian mode.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
</tbody>
|
|
|
</tgroup>
|
|
|
</table>
|
|
|
<para>
|
|
|
Previous versions of the VMX built-in functions defined
intrinsics to access the VMX instructions <code>lvsl</code>
and <code>lvsr</code>, which could be used in conjunction with
<code>vec_vperm</code> and VMX load and store instructions for
unaligned access. The <code>vec_lvsl</code> and
<code>vec_lvsr</code> interfaces are deprecated in favor of
the interfaces specified here. For compatibility, the
built-in pseudo sequences published in previous VMX documents
continue to work with little-endian data layout and the
little-endian vector layout described in this
document. However, the use of these sequences in new code is
discouraged and usually results in worse performance. It is
recommended (but not required) that compilers issue a warning
when these functions are used in little-endian
environments. Programmers should instead use the
<code>vec_xl</code> and <code>vec_xst</code> vector built-in
functions to access unaligned data streams. See the
descriptions of these built-in functions in <xref
linkend="VIPR.vec-ref" /> for further description and
implementation details.
|
|
|
</para>
|
|
|
</section>
|
|
|
<section>
|
|
|
<title>Big-Endian Vector Layout in Little-Endian Environments
|
|
|
(Deprecated)</title>
|
|
|
<para>
|
|
|
Versions 1.0 through 1.4 of the 64-Bit ELFv2 ABI Specification
|
|
|
for POWER provided for optional compiler support for using
|
|
|
big-endian element ordering in little-endian environments.
|
|
|
This was initially deemed useful for porting certain libraries
|
|
|
that assumed big-endian element ordering regardless of the
|
|
|
endianness of their input streams. In practice, this
|
|
|
introduced serious compiler complexity without much utility.
|
|
|
Thus this support (previously controlled by switches
|
|
|
<code>-maltivec=be</code> and/or <code>-qaltivec=be</code>) is
|
|
|
now deprecated. Current versions of the gcc and clang
|
|
|
open-source compilers do not implement this support.
|
|
|
</para>
|
|
|
</section>
|
|
|
</section>
|
|
|
|
|
|
<section>
|
|
|
<title>Language-Specific Vector Support for Other
|
|
|
Languages</title>
|
|
|
<section>
|
|
|
<title>Fortran</title>
|
|
|
<para>
|
|
|
<xref linkend="VIPR.biendian.fortran-types" /> shows the
|
|
|
correspondence between the C/C++ types described in this
|
|
|
document and their Fortran equivalents. In Fortran, the
|
|
|
Boolean vector data types are represented by
|
|
|
<code>VECTOR(UNSIGNED(</code><emphasis>n</emphasis><code>))</code>.
|
|
|
</para>
|
|
|
<table frame="all" pgwide="1" xml:id="VIPR.biendian.fortran-types">
|
|
|
<title>Fortran Vector Data Types</title>
|
|
|
<tgroup cols="2">
|
|
|
<colspec colname="c1" colwidth="50*" />
|
|
|
<colspec colname="c2" colwidth="50*" />
|
|
|
<thead>
|
|
|
<row>
|
|
|
<entry align="center">
|
|
|
<para>
|
|
|
<emphasis role="bold">XL Fortran Vector Type</emphasis>
|
|
|
</para>
|
|
|
</entry>
|
|
|
<entry align="center">
|
|
|
<para>
|
|
|
<emphasis role="bold">XL C/C++ Vector Type</emphasis>
|
|
|
</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
</thead>
|
|
|
<tbody>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>VECTOR(INTEGER(1))</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>vector signed char</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>VECTOR(INTEGER(2))</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>vector signed short</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>VECTOR(INTEGER(4))</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>vector signed int</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>VECTOR(INTEGER(8))</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>vector signed long long, vector signed long<footnote
|
|
|
xml:id="vlongappalling">
|
|
|
<para>The vector long types are deprecated due to their
|
|
|
ambiguity between 32-bit and 64-bit environments. The use
|
|
|
of the vector long long types is preferred.</para>
|
|
|
</footnote></para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>VECTOR(INTEGER(16))</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>vector signed __int128</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>VECTOR(UNSIGNED(1))</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>vector unsigned char</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>VECTOR(UNSIGNED(2))</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>vector unsigned short</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>VECTOR(UNSIGNED(4))</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>vector unsigned int</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>VECTOR(UNSIGNED(8))</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>vector unsigned long long, vector unsigned long<footnoteref
|
|
|
linkend="vlongappalling" /></para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>VECTOR(UNSIGNED(16))</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>vector unsigned __int128</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>VECTOR(REAL(4))</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>vector float</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>VECTOR(REAL(8))</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>vector double</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>VECTOR(PIXEL)</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>vector pixel</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
</tbody>
|
|
|
</tgroup>
|
|
|
</table>
|
|
|
<para>
|
|
|
Because the Fortran language does not support pointers, vector
|
|
|
built-in functions that expect pointers to a base type take an
|
|
|
array element reference to indicate the address of a memory
|
|
|
location that is the subject of a memory access built-in
|
|
|
function.
|
|
|
</para>
|
|
|
<para>
|
|
|
Because the Fortran language does not support type casts, the
|
|
|
<code>vec_convert</code> and <code>vec_concat</code> built-in
|
|
|
functions shown in <xref linkend="VIPR.endian.convert" /> are
|
|
|
provided to perform bit-exact type conversions between vector
|
|
|
types.
|
|
|
</para>
|
|
|
<table frame="all" pgwide="1" xml:id="VIPR.endian.convert">
|
|
|
<title>Built-In Vector Conversion Functions</title>
|
|
|
<tgroup cols="2">
|
|
|
<colspec colname="c1" colwidth="30*" align="center" />
|
|
|
<colspec colname="c2" colwidth="70*" />
|
|
|
<thead>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>
|
|
|
<emphasis role="bold">Group</emphasis>
|
|
|
</para>
|
|
|
</entry>
|
|
|
<entry align="center">
|
|
|
<para>
|
|
|
<emphasis role="bold">Description</emphasis>
|
|
|
</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
</thead>
|
|
|
<tbody>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>VEC_CONCAT (ARG1, ARG2)<?linebreak?>(Fortran)</para>
|
|
|
<para></para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Purpose:</para>
|
|
|
<para>Concatenates two elements to form a vector.</para>
|
|
|
<para>Result value:</para>
|
|
|
<para>The resulting vector consists of the two scalar elements,
|
|
|
ARG1 and ARG2, assigned to elements 0 and 1 (using the
|
|
|
environment’s native endian numbering), respectively.</para>
|
|
|
<itemizedlist>
|
|
|
<listitem>
|
|
|
<para><emphasis role="bold">Note: </emphasis>This function corresponds to the C/C++ vector
|
|
|
constructor (vector type){a,b}. It is provided only for
|
|
|
languages without vector constructors.</para>
|
|
|
</listitem>
|
|
|
</itemizedlist>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para></para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>vector signed long long vec_concat (signed long long,
|
|
|
signed long long);</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para></para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>vector unsigned long long vec_concat (unsigned long long,
|
|
|
unsigned long long);</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para></para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>vector double vec_concat (double, double);</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>VEC_CONVERT(V, MOLD)</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Purpose:</para>
|
|
|
<para>Converts a vector to a vector of a given type.</para>
|
|
|
<para>Class:</para>
|
|
|
<para>Pure function</para>
|
|
|
<para>Argument type and attributes:</para>
|
|
|
<itemizedlist spacing="compact">
|
|
|
<listitem>
|
|
|
<para>V Must be an INTENT(IN) vector.</para>
|
|
|
</listitem>
|
|
|
<listitem>
|
|
|
<para>MOLD Must be an INTENT(IN) vector. If it is a
|
|
|
variable, it need not be defined.</para>
|
|
|
</listitem>
|
|
|
</itemizedlist>
|
|
|
<para>Result type and attributes:</para>
|
|
|
<para>The result is a vector of the same type as MOLD.</para>
|
|
|
<para>Result value:</para>
|
|
|
<para>The result is as if it were on the left-hand side of an
|
|
|
intrinsic assignment with V on the right-hand side.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
</tbody>
|
|
|
</tgroup>
|
|
|
</table>
|
|
|
</section>
|
|
|
</section>
|
|
|
|
|
|
<section>
|
|
|
<title>Examples</title>
|
|
|
<para>filler</para>
|
|
|
</section>
|
|
|
|
|
|
<section>
|
|
|
<title>Limitations</title>
|
|
|
<para>
|
|
|
<code>vec_sld</code>
|
|
|
</para>
|
|
|
<para>
|
|
|
<code>vec_perm</code>
|
|
|
</para>
|
|
|
</section>
|
|
|
|
|
|
</chapter>
|