<!--
  Copyright (c) 2016 OpenPOWER Foundation

  Licensed under the GNU Free Documentation License, Version 1.3;
  with no Invariants Sections, with no Front-Cover Texts,
  and with no Back-Cover Texts (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

      http://www.gnu.org/licenses/fdl-1.3.txt

-->
<chapter xmlns="http://docbook.org/ns/docbook"
         xmlns:xl="http://www.w3.org/1999/xlink" version="5.0"
         xml:lang="en"
         xml:id="dbdoclet.50655244_pgfId-1095944">
<title>Vector Programming Interfaces</title>
<para revisionflag="added">
Earlier versions of this ABI included a description of vector
programming interfaces and techniques for POWER®, along with an
appendix enumerating the supported vector built-in functions.
Most of this information is not part of the ABI and has been removed
from this version of the document. Readers are instead encouraged
to refer to the <emphasis role="underline">POWER Vector
Intrinsics Programming Reference</emphasis>,
available from the OpenPOWER Foundation in its Technical
Resources Catalog (<link
xl:href="https://openpowerfoundation.org/technical/resource-catalog/"
/>).
</para>
<para revisionflag="deleted">To ensure portability of applications
optimized to exploit the SIMD
functions of Power ISA processors, the ELF V2 ABI defines a set of
functions and data types for SIMD programming. ELF V2-compliant compilers
will provide suitable support for these functions, preferably as built-in
functions that translate to one or more Power ISA instructions.</para>
<para revisionflag="deleted">Compilers are encouraged, but not
required, to provide built-in
functions to access individual instructions in the IBM POWER® instruction
set architecture. In most cases, each such built-in function should provide
direct access to the underlying instruction.</para>
<para revisionflag="deleted">However, to ease porting between
little-endian (LE) and big-endian
(BE) POWER systems, and between POWER and other platforms, it is preferable
that some built-in functions provide the same semantics on both LE and BE
POWER systems, even if this means that the built-in functions are
implemented with different instruction sequences for LE and BE. To achieve
this, vector built-in functions provide a set of functions derived from the
set of hardware functions provided by the Power vector SIMD instructions.
Unlike traditional “hardware intrinsic” built-in functions, no fixed
mapping exists between these built-in functions and the generated hardware
instruction sequence. Rather, the compiler is free to generate optimized
instruction sequences that implement the semantics of the program specified
by the programmer using these built-in functions.</para>
<para revisionflag="deleted">This is primarily applicable to the
vector facility of the POWER ISA,
also known as Power SIMD, consisting of the VMX (or Altivec) and VSX
instructions. This set of instructions operates on groups of 2, 4, 8, or 16
vector elements at a time in 128-bit registers. On a big-endian POWER
platform, vector elements are loaded from memory into a register so that
the 0th element occupies the high-order bits of the register, and the
(N – 1)th element occupies the low-order bits of the register. This is
referred to as big-endian element order. On a little-endian POWER platform,
vector elements are loaded from memory such that the 0th element occupies
the low-order bits of the register, and the (N – 1)th element occupies the
high-order bits. This is referred to as little-endian element order.</para>
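<para>The two element orders described above can be illustrated with a small,
portable model. The following is only a sketch: the names
<literal>element_order</literal> and <literal>element_register_offset</literal>
are hypothetical and are not defined by this ABI. The function computes where
an element begins within a 16-byte register, measured in bytes from the
high-order end.</para>
<programlisting language="c"><![CDATA[
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for the two element orders described in the text. */
enum element_order { ORDER_BIG_ENDIAN, ORDER_LITTLE_ENDIAN };

/*
 * Return the offset, in bytes from the high-order (leftmost) end of a
 * 16-byte vector register, at which element `index` begins.
 * `elem_size` is the element size in bytes (1, 2, 4, or 8).
 */
size_t element_register_offset(size_t index, size_t elem_size,
                               enum element_order order)
{
    assert(elem_size != 0 && 16 % elem_size == 0);

    size_t count = 16 / elem_size;  /* N: number of elements in the register */
    assert(index < count);

    if (order == ORDER_BIG_ENDIAN)
        return index * elem_size;            /* element 0 is leftmost */
    return (count - 1 - index) * elem_size;  /* element 0 is rightmost */
}
```
]]></programlisting>
<para>For four-byte elements, element 0 starts at offset 0 in big-endian
element order and at offset 12 in little-endian element order.</para>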
<section xml:id="dbdoclet.50655244_39970" revisionflag="deleted">
<title>Vector Data Types</title>
<para>Languages provide support for the data types in
<xref linkend="dbdoclet.50655240_89351" /> to represent vector data types
stored in vector registers.</para>
<para>For the C and C++ programming languages (and related/derived
languages), these data types may be accessed based on the type names listed
in
<xref linkend="dbdoclet.50655240_89351" /> when Power ISA SIMD language
extensions are enabled using either the vector or __vector keywords.</para>
<para>For the Fortran language,
<xref linkend="dbdoclet.50655244_80766" /> gives a correspondence of Fortran
and C/C++ language types.</para>
<para>The assignment operator always performs a byte-by-byte data copy for
vector data types.</para>
<para>Like other C/C++ language types, vector types may be defined to have
const or volatile properties. Vector data types can be defined as being in
static, auto, and register storage.</para>
<para>Pointers to vector types are defined like pointers to other C/C++
types. Pointers to objects may be defined to have const and volatile
properties.</para>
<para>The preferred way to access vectors at an application-defined address
is by using vector pointers and the C/C++ dereference operator *. As with
other C/C++ data types, the array reference operator [ ] may be used with a
vector pointer, with the usual definition, to access the n-th vector
from that pointer. The dereference
operator * may <emphasis>not</emphasis> be used to access data that is
not aligned at least to a quadword boundary. Built-in functions such as
vec_xl and vec_xst are provided for unaligned data access.</para>
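<para>A program that receives an address of unknown alignment might test it
before choosing between the dereference operator and an unaligned-access
built-in function. The following is a minimal sketch in portable C; the helper
name <literal>is_quadword_aligned</literal> is hypothetical and is not part of
this ABI.</para>
<programlisting language="c"><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return nonzero if `p` lies on a quadword (16-byte) boundary. */
int is_quadword_aligned(const void *p)
{
    assert(p != NULL);
    /* A quadword-aligned address has its four low-order bits clear. */
    return ((uintptr_t)p & 0xF) == 0;
}
```
]]></programlisting>
<para>When the check fails, the data would be accessed with a built-in
function such as vec_xl or vec_xst rather than through a vector
pointer.</para>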
<para>Compilers are expected to recognize multiple operations
that can be combined into a single hardware instruction. For example, a
load and splat hardware instruction might be generated for the following
sequence:</para>
<programlisting>double *double_ptr;
register vector double vd = vec_splats(*double_ptr);</programlisting>
</section>
<section xml:id="dbdoclet.50655244_83520" revisionflag="deleted">
<title>Vector Operators</title>
<para>In addition to the dereference and assignment operators, the Power
SIMD Vector Programming API provides the usual operators that are valid on
pointers; these operators are also valid for pointers to vector
types.</para>
<para>The traditional C/C++ operators are defined on vector types with “do
all” semantics for unary and binary +, unary and binary -, binary *, binary
%, and binary / as well as the unary and binary shift, logical and
comparison operators, and the ternary ?: operator.</para>
<para>For unary operators, the specified operation is performed on the
corresponding base element of the single operand to derive the result value
for each vector element of the vector result. The result type of unary
operations is the type of the single input operand.</para>
<para>For binary operators, the specified operation is performed on the
corresponding base elements of both operands to derive the result value for
each vector element of the vector result. Both operands of the binary
operators must have the same vector type with the same base element type.
The result of binary operators is the same type as the type of the input
operands.</para>
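<para>The “do all” semantics of the unary and binary operators can be
modeled in portable C with plain arrays standing in for vector types. This is
only an illustration of the element-wise rule; the function names are
hypothetical, and a compiler would normally implement such operators with
SIMD instructions.</para>
<programlisting language="c"><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Binary +: each result element is derived from the corresponding
 * elements of both operands. */
void elementwise_add_u32(const uint32_t *a, const uint32_t *b,
                         uint32_t *result, size_t n)
{
    assert(a != NULL && b != NULL && result != NULL);
    for (size_t i = 0; i < n; i++)
        result[i] = a[i] + b[i];   /* unsigned arithmetic wraps modulo 2^32 */
}

/* Unary -: each result element is derived from the corresponding
 * element of the single operand. */
void elementwise_neg_u32(const uint32_t *a, uint32_t *result, size_t n)
{
    assert(a != NULL && result != NULL);
    for (size_t i = 0; i < n; i++)
        result[i] = 0u - a[i];
}
```
]]></programlisting>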
<para>Further, the array reference operator may be applied to vector data
types, yielding an l-value corresponding to the specified element in
accordance with the vector element numbering rules (see
<xref linkend="dbdoclet.50655244_25365" />). An l-value may either be
assigned a new value or accessed for reading its value.</para>
</section>
<section xml:id="dbdoclet.50655244_25365" revisionflag="deleted">
<title>Vector Layout and Element Numbering</title>
<para>Vector data types consist of a homogeneous sequence of elements of
the base data type specified in the vector data type. Individual elements
of a vector can be addressed by a vector element number. Element numbers
can be established either by counting from the “left” of a register and
assigning the left-most element the element number 0, or from the “right”
of the register and assigning the right-most element the element number
0.</para>
<para>In big-endian environments, establishing element counts from the left
makes the element stored at the lowest memory address the lowest-numbered
element. Thus, when vectors and arrays of a given base data type are
overlaid, vector element 0 corresponds to array element 0, vector element 1
corresponds to array element 1, and so forth.</para>
<para>In little-endian environments, establishing element counts from the
right makes the element stored at the lowest memory address the
lowest-numbered element. Thus, when vectors and arrays of a given base data
type are overlaid, vector element 0 will correspond to array element 0,
vector element 1 will correspond to array element 1, and so forth.</para>
<para>Consequently, the vector numbering schemes can be described as
big-endian and little-endian vector layouts and vector element numberings.
(The term “endian” comes from the endian debates presented in
<citetitle>Gulliver's Travels</citetitle> by Jonathan Swift.)</para>
<para>For internal consistency, in the ELF V2 ABI, the default vector
layout and vector element ordering in big-endian environments shall be big
endian, and the default vector layout and vector element ordering in
little-endian environments shall be little endian.</para>
<para>This element numbering shall also be used by the [ ] accessor method
to vector elements provided as an extension of the C/C++ languages by some
compilers, as well as for other language extensions or library constructs
that directly or indirectly refer to elements by their element
number.</para>
<para>Application programs may query the vector element ordering in use
(that is, whether -qaltivec=be or -maltivec=be has been selected) by
testing the __VEC_ELEMENT_REG_ORDER__ macro. This macro has two possible
values:</para>
<informaltable frame="none" rowsep="0" colsep="0">
  <tgroup cols="2">
    <colspec colname="c1" colwidth="40*" />
    <colspec colname="c2" colwidth="60*" />
    <tbody>
      <row>
        <entry><para>__ORDER_LITTLE_ENDIAN__</para></entry>
        <entry><para>Vector elements use little-endian element ordering.</para></entry>
      </row>
      <row>
        <entry><para>__ORDER_BIG_ENDIAN__</para></entry>
        <entry><para>Vector elements use big-endian element ordering.</para></entry>
      </row>
    </tbody>
  </tgroup>
</informaltable>
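<para>A program might wrap the macro test in a small helper, as in the sketch
below. The names <literal>element_order</literal> and
<literal>vector_element_order</literal> are hypothetical. The fallback
branches are assumptions made for this illustration: where
__VEC_ELEMENT_REG_ORDER__ is not defined, the sketch assumes the default rule
stated above, namely that element order follows the byte order of the
environment.</para>
<programlisting language="c"><![CDATA[
```c
/* Hypothetical result type for this illustration. */
enum element_order { ELEMENT_ORDER_BIG, ELEMENT_ORDER_LITTLE };

/* Report the vector element ordering in use. */
enum element_order vector_element_order(void)
{
#if defined(__VEC_ELEMENT_REG_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__)
    /* The macro described in the text is available: test it directly. */
    return (__VEC_ELEMENT_REG_ORDER__ == __ORDER_LITTLE_ENDIAN__)
               ? ELEMENT_ORDER_LITTLE
               : ELEMENT_ORDER_BIG;
#elif defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__)
    /* Assumption: no override is in effect, so element order
     * follows the byte order reported by the compiler. */
    return (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
               ? ELEMENT_ORDER_LITTLE
               : ELEMENT_ORDER_BIG;
#else
    /* Last resort: probe the byte order at run time. */
    const unsigned int probe = 1;
    return (*(const unsigned char *)&probe == 1)
               ? ELEMENT_ORDER_LITTLE
               : ELEMENT_ORDER_BIG;
#endif
}
```
]]></programlisting>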
</section>
<section xml:id="dbdoclet.50655244_90667" revisionflag="deleted">
<title>Vector Built-in Functions</title>
<para>The Power language environments provide a well-known set of built-in
functions for the Power SIMD instructions (including both Altivec/VMX and
VSX). A full description of these built-in functions is beyond the scope of
this ABI document. Most built-in functions are polymorphic, operating on a
variety of vector types (vectors of signed characters, vectors of unsigned
halfwords, and so forth).</para>
<para>Some of the Power SIMD (VMX/Altivec and/or VSX) hardware instructions
refer, implicitly or explicitly, to vector element numbers. For example,
the vspltb instruction has as one of its inputs an index into a vector. The
element at that index position is to be replicated in every element of the
output vector. For another example, the vmuleuh instruction operates on the
even-numbered elements of its input vectors. The hardware instructions
define these element numbers using big-endian element order, even when the
machine is running in little-endian mode. Thus, a built-in function that
maps directly to the underlying hardware instruction, regardless of the
target endianness, has the potential to confuse programmers on
little-endian platforms.</para>
<para>It is more useful to define built-in functions that map to these
instructions to use natural element order. That is, the explicit or
implicit element numbers specified by such built-in functions should be
interpreted using big-endian element order on a big-endian platform, and
using little-endian element order on a little-endian platform.</para>
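<para>For a built-in function that takes an explicit element number, natural
element order amounts to a simple index conversion before the hardware
instruction is selected. The sketch below shows that conversion in portable
C; the function name <literal>hw_element_number</literal> is hypothetical,
and a real compiler performs this adjustment internally.</para>
<programlisting language="c"><![CDATA[
```c
#include <assert.h>

/*
 * Convert an element number given in natural element order into the
 * big-endian element number used by the hardware instruction.
 * `n` is the number of elements in the vector (2, 4, 8, or 16).
 */
unsigned hw_element_number(unsigned natural, unsigned n, int little_endian)
{
    assert(natural < n);
    /* On LE, natural element 0 is the rightmost element, which the
     * hardware numbers n - 1. On BE, the two numberings agree. */
    return little_endian ? (n - 1 - natural) : natural;
}
```
]]></programlisting>
<para>For a vector of 16 bytes on a little-endian platform, natural element 0
corresponds to hardware element 15.</para>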
<para>This ABI defines the following built-in functions to use natural
element order. The Implementation Notes column suggests possible ways to
implement little-endian (LE) versions of the built-in functions, although
designers of a compiler are free to use other methods to implement the
specified semantics as they see fit.</para>
<para> </para>
<table frame="all" pgwide="1" xml:id="dbdoclet.50655244_35023">
  <title>Endian-Sensitive Operations</title>
  <tgroup cols="3">
    <colspec colname="c1" colwidth="25*" align="center" />
    <colspec colname="c2" colwidth="30*" align="center" />
    <colspec colname="c3" colwidth="45*" />
    <thead>
      <row>
        <entry>
          <para>
            <emphasis role="bold">Built-In Function</emphasis>
          </para>
        </entry>
        <entry>
          <para>
            <emphasis role="bold">Corresponding POWER
            Instructions</emphasis>
          </para>
        </entry>
        <entry align="center">
          <para>
            <emphasis role="bold">Implementation Notes</emphasis>
          </para>
        </entry>
      </row>
    </thead>
    <tbody>
      <row>
        <entry><para>vec_bperm</para></entry>
        <entry><para> </para></entry>
        <entry><para>For LE unsigned long long ARGs, swap halves of ARG2 and of
        the result.</para></entry>
      </row>
      <row>
        <entry><para>vec_cntlz_lsbb</para></entry>
        <entry><para> </para></entry>
        <entry><para>For LE, use vctzlsbb.</para></entry>
      </row>
      <row>
        <entry><para>vec_cnttz_lsbb</para></entry>
        <entry><para> </para></entry>
        <entry><para>For LE, use vclzlsbb.</para></entry>
      </row>
      <row>
        <entry><para>vec_extract</para></entry>
        <entry><para>None</para></entry>
        <entry><para>vec_extract (v, 3) is equivalent to v[3].</para></entry>
      </row>
      <row>
        <entry><para>vec_extract_fp32_from_shorth</para></entry>
        <entry><para> </para></entry>
        <entry><para>For LE, extract the left four elements.</para></entry>
      </row>
      <row>
        <entry><para>vec_extract_fp32_from_shortl</para></entry>
        <entry><para> </para></entry>
        <entry><para>For LE, extract the right four elements.</para></entry>
      </row>
      <row>
        <entry><para>vec_extract4b</para></entry>
        <entry><para> </para></entry>
        <entry><para>For LE, subtract the byte position from 12, and swap the
        halves of the result.</para></entry>
      </row>
      <row>
        <entry><para>vec_first_match_index</para></entry>
        <entry><para> </para></entry>
        <entry><para>For LE, use vctz.</para></entry>
      </row>
      <row>
        <entry><para>vec_first_match_index_or_eos</para></entry>
        <entry><para> </para></entry>
        <entry><para>For LE, use vctz.</para></entry>
      </row>
      <row>
        <entry><para>vec_insert</para></entry>
        <entry><para>None</para></entry>
        <entry><para>vec_insert (x, v, 3) returns the vector v with
        element <emphasis>3</emphasis> modified to contain x.</para></entry>
      </row>
      <row>
        <entry><para>vec_insert4b</para></entry>
        <entry><para> </para></entry>
        <entry><para>For LE, subtract the byte position from 12, and swap the
        halves of ARG1.</para></entry>
      </row>
      <row>
        <entry><para>vec_mergee</para></entry>
        <entry><para>vmrgew</para></entry>
        <entry><para>Swap inputs and use vmrgow for LE. Phased in.<footnote xml:id="pgfId-1105723">
        <para>This optional function is being phased in, and it may not
        be available on all implementations.</para>
        </footnote></para></entry>
      </row>
      <row>
        <entry><para>vec_mergeh</para></entry>
        <entry><para>vmrghb, vmrghh, vmrghw</para></entry>
        <entry><para>Swap inputs and use vmrglb, and so on, for LE.</para></entry>
      </row>
      <row>
        <entry><para>vec_mergel</para></entry>
        <entry><para>vmrglb, vmrglh, vmrglw</para></entry>
        <entry><para>Swap inputs and use vmrghb, and so on, for LE.</para></entry>
      </row>
      <row>
        <entry><para>vec_mergeo</para></entry>
        <entry><para>vmrgow</para></entry>
        <entry><para>Swap inputs and use vmrgew for LE. Phased in.<footnoteref linkend="pgfId-1105723" /> </para></entry>
      </row>
      <row>
        <entry><para>vec_mule</para></entry>
        <entry><para>vmuleub, vmulesb, vmuleuh, vmulesh</para></entry>
        <entry><para>Replace with vmuloub, and so on, for LE.</para></entry>
      </row>
      <row>
        <entry><para>vec_mulo</para></entry>
        <entry><para>vmuloub, vmulosb, vmulouh, vmulosh</para></entry>
        <entry><para>Replace with vmuleub, and so on, for LE.</para></entry>
      </row>
      <row>
        <entry><para>vec_pack</para></entry>
        <entry><para>vpkuhum, vpkuwum</para></entry>
        <entry><para>Swap input arguments for LE.</para></entry>
      </row>
      <row>
        <entry><para>vec_packpx</para></entry>
        <entry><para>vpkpx</para></entry>
        <entry><para>Swap input arguments for LE.</para></entry>
      </row>
      <row>
        <entry><para>vec_packs</para></entry>
        <entry><para>vpkuhus, vpkshss, vpkuwus, vpkswss</para></entry>
        <entry><para>Swap input arguments for LE.</para></entry>
      </row>
      <row>
        <entry><para>vec_packsu</para></entry>
        <entry><para>vpkuhus, vpkshus, vpkuwus, vpkswus</para></entry>
        <entry><para>Swap input arguments for LE.</para></entry>
      </row>
      <row>
        <entry><para>vec_perm</para></entry>
        <entry><para>vperm</para></entry>
        <entry><para>For LE, swap input arguments and complement the selection
        vector.</para></entry>
      </row>
      <row>
        <entry><para>vec_splat</para></entry>
        <entry><para>vspltb, vsplth, vspltw</para></entry>
        <entry><para>Subtract the element number from N – 1 for LE.</para></entry>
      </row>
      <row>
        <entry><para>vec_sum2s</para></entry>
        <entry><para>vsum2sws</para></entry>
        <entry><para>For LE, swap elements 0 and 1, and elements 2 and 3, of the
        second input argument; then swap elements 0 and 1, and elements 2
        and 3, of the result vector.</para></entry>
      </row>
      <row>
        <entry><para>vec_sums</para></entry>
        <entry><para>vsumsws</para></entry>
        <entry><para>For LE, use element 3 in little-endian order from the
        second input vector, and place the result in element 3 in
        little-endian order of the result vector.</para></entry>
      </row>
      <row>
        <entry><para>vec_unpackh</para></entry>
        <entry><para>vupkhsb, vupkhpx, vupkhsh</para></entry>
        <entry><para>Use vupklsb, and so on, for LE.</para></entry>
      </row>
      <row>
        <entry><para>vec_unpackl</para></entry>
        <entry><para>vupklsb, vupklpx, vupklsh</para></entry>
        <entry><para>Use vupkhsb, and so on, for LE.</para></entry>
      </row>
      <row>
        <entry><para>vec_xl_len_r</para></entry>
        <entry><para> </para></entry>
        <entry><para>For LE, the bytes are loaded left justified then shifted
        right 16 – cnt bytes or rotated left cnt bytes. Let “cnt” be the
        number of bytes specified to be loaded by vec_xl_len_r.</para></entry>
      </row>
      <row>
        <entry><para>vec_xst_len_r</para></entry>
        <entry><para> </para></entry>
        <entry><para>For LE, the bytes are shifted left 16 – cnt bytes or rotated
        right cnt bytes so they are left justified to be stored. Let
        “cnt” be the number of bytes specified to be stored by
        vec_xst_len_r.</para></entry>
      </row>
    </tbody>
  </tgroup>
</table>
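<para>The implementation note for vec_perm (swap the input arguments and
complement the selection vector) can be checked with a portable model. The
sketch below is an illustration only: <literal>vreg</literal>,
<literal>hw_vperm</literal>, <literal>vec_perm_le</literal>, and
<literal>vec_perm_reference</literal> are hypothetical names, and
<literal>hw_vperm</literal> is a simplified model of the vperm instruction
that uses big-endian byte numbering. Arrays passed to the last two functions
are indexed by little-endian element number.</para>
<programlisting language="c"><![CDATA[
```c
#include <stdint.h>
#include <string.h>

/* Register image: byte[0] is the leftmost (high-order) byte. */
typedef struct { uint8_t byte[16]; } vreg;

/* Byte element i in LE element order sits at register position 15 - i.
 * The reversal is its own inverse, so it converts in both directions. */
static vreg le_reverse(vreg v)
{
    vreg r;
    for (int p = 0; p < 16; p++)
        r.byte[p] = v.byte[15 - p];
    return r;
}

/* Simplified model of vperm: select from a||b using BE byte numbers. */
static vreg hw_vperm(vreg a, vreg b, vreg c)
{
    vreg r;
    for (int p = 0; p < 16; p++) {
        unsigned sel = c.byte[p] & 0x1F;
        r.byte[p] = (sel < 16) ? a.byte[sel] : b.byte[sel - 16];
    }
    return r;
}

/* Natural-order vec_perm for LE, built as the table note suggests. */
void vec_perm_le(const uint8_t a[16], const uint8_t b[16],
                 const uint8_t c[16], uint8_t out[16])
{
    vreg ra, rb, rc, rr;
    memcpy(ra.byte, a, 16);
    memcpy(rb.byte, b, 16);
    memcpy(rc.byte, c, 16);
    ra = le_reverse(ra);
    rb = le_reverse(rb);
    rc = le_reverse(rc);
    for (int p = 0; p < 16; p++)
        rc.byte[p] = (uint8_t)~rc.byte[p];  /* complement the selection */
    rr = hw_vperm(rb, ra, rc);              /* inputs swapped */
    rr = le_reverse(rr);
    memcpy(out, rr.byte, 16);
}

/* Reference semantics: result[i] = (a||b)[c[i] mod 32], natural numbering. */
void vec_perm_reference(const uint8_t a[16], const uint8_t b[16],
                        const uint8_t c[16], uint8_t out[16])
{
    for (int i = 0; i < 16; i++) {
        unsigned sel = c[i] & 0x1F;
        out[i] = (sel < 16) ? a[sel] : b[sel - 16];
    }
}
```
]]></programlisting>
<para>Complementing a five-bit selector k yields 31 – k, which, combined with
the swapped inputs, selects the same byte that natural little-endian
numbering would select.</para>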
<para> </para>
<bridgehead>Extended Data Movement Functions</bridgehead>
<para>The built-in functions in
<xref linkend="dbdoclet.50655244_42521" /> map to Altivec/VMX load and
store instructions and provide access to the “auto-aligning” memory
instructions of the Altivec ISA, where low-order address bits are
discarded before performing a memory access. These instructions
load and store data in accordance with the program's current endian mode,
and the compiler does not need to adapt them for little-endian
operation during code generation:</para>
<para> </para>
<table frame="all" pgwide="1" xml:id="dbdoclet.50655244_42521">
  <title>Altivec Memory Access Built-In Functions</title>
  <tgroup cols="3">
    <colspec colname="c1" colwidth="15*" align="center" />
    <colspec colname="c2" colwidth="35*" align="center" />
    <colspec colname="c3" colwidth="50*" />
    <thead>
      <row>
        <entry>
          <para>
            <emphasis role="bold">Built-in Function</emphasis>
          </para>
        </entry>
        <entry>
          <para>
            <emphasis role="bold">Corresponding POWER
            Instructions</emphasis>
          </para>
        </entry>
        <entry align="center">
          <para>
            <emphasis role="bold">Implementation Notes</emphasis>
          </para>
        </entry>
      </row>
    </thead>
    <tbody>
      <row>
        <entry><para>vec_ld</para></entry>
        <entry><para>lvx</para></entry>
        <entry><para>Hardware works as a function of endian mode.</para></entry>
      </row>
      <row>
        <entry><para>vec_lde</para></entry>
        <entry><para>lvebx, lvehx, lvewx</para></entry>
        <entry><para>Hardware works as a function of endian mode.</para></entry>
      </row>
      <row>
        <entry><para>vec_ldl</para></entry>
        <entry><para>lvxl</para></entry>
        <entry><para>Hardware works as a function of endian mode.</para></entry>
      </row>
      <row>
        <entry><para>vec_st</para></entry>
        <entry><para>stvx</para></entry>
        <entry><para>Hardware works as a function of endian mode.</para></entry>
      </row>
      <row>
        <entry><para>vec_ste</para></entry>
        <entry><para>stvebx, stvehx, stvewx</para></entry>
        <entry><para>Hardware works as a function of endian mode.</para></entry>
      </row>
      <row>
        <entry><para>vec_stl</para></entry>
        <entry><para>stvxl</para></entry>
        <entry><para>Hardware works as a function of endian mode.</para></entry>
      </row>
    </tbody>
  </tgroup>
</table>
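<para>The “auto-aligning” behavior of these instructions can be pictured as
masking the effective address before the access. The following sketch models
only the address calculation; the function name
<literal>auto_aligned_address</literal> is hypothetical. It assumes an access
size of 16 bytes for the quadword forms (such as lvx and stvx) and of 1, 2,
or 4 bytes for the element forms (such as lvebx, lvehx, and lvewx).</para>
<programlisting language="c"><![CDATA[
```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the address actually accessed when the low-order bits of the
 * effective address `ea` are discarded for an access of `size` bytes.
 * `size` must be a power of two.
 */
uintptr_t auto_aligned_address(uintptr_t ea, uintptr_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return ea & ~(size - 1);
}
```
]]></programlisting>
<para>With this model, a 16-byte access at address 0x1007 touches the
quadword that begins at 0x1000, which is why these built-in functions are
not suitable for unaligned data.</para>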
<para>Previous versions of the Altivec built-in functions defined
intrinsics to access the Altivec instructions lvsl and lvsr, which could
be used in conjunction with vec_vperm and Altivec load and store
instructions for unaligned access. The vec_lvsl and vec_lvsr interfaces
are deprecated in favor of the interfaces specified here. For
compatibility, the built-in pseudo sequences published in previous VMX
documents continue to work with little-endian data layout and the
little-endian vector layout described in this document. However, the use
of these sequences in new code is discouraged and usually results in
worse performance. It is recommended (but not required) that compilers
issue a warning when these functions are used in little-endian
environments. It is recommended that programmers use the vec_xl and
vec_xst vector built-in functions to access unaligned data
streams.</para>
<para>The built-in functions in
<xref linkend="dbdoclet.50655244_62451" /> provide unaligned access to
data in memory that is to be copied to or from a variable having vector
data type. Memory access built-in
functions that specify a vector element format (that is, the w4 and d2
forms) are deprecated. They will be phased out in future versions of this
specification because vec_xl and vec_xst provide overloaded
layout-specific memory access based on the specified vector data
type.</para>
<table frame="all" pgwide="1" xml:id="dbdoclet.50655244_62451">
  <title>VSX Memory Access Built-In Functions</title>
  <tgroup cols="3">
    <colspec colname="c1" colwidth="15*" align="center" />
    <colspec colname="c2" colwidth="35*" align="center" />
    <colspec colname="c3" colwidth="50*" />
    <thead>
      <row>
        <entry>
          <para>
            <emphasis role="bold">Built-in Function</emphasis>
          </para>
        </entry>
        <entry>
          <para>
            <emphasis role="bold">Corresponding POWER
            Instructions</emphasis>
          </para>
        </entry>
        <entry align="center">
          <para>
            <emphasis role="bold">Little-Endian Implementation
            Notes</emphasis>
          </para>
        </entry>
      </row>
    </thead>
    <tbody>
      <row>
        <entry><para>vec_xl</para></entry>
        <entry><para>lxvd2x</para></entry>
        <entry><para>lxvd2x ; xxpermdi</para></entry>
      </row>
      <row>
        <entry><para>vec_xlw4<footnote xml:id="dbdoclet.50655244_73052"><para>
        Deprecated. The use of vector data type
        assignment and overloaded vec_xl and vec_xst vector
        built-in functions are preferred forms for assigning
        vector operations. Similarly, the use of
        <literal>__builtin_lxvd2x</literal>, <literal>__builtin_lxvw4x</literal>,
        <literal>__builtin_stxvd2x</literal>, <literal>__builtin_stxvw4x</literal>,
        available in some compilers, is discouraged.</para></footnote>
        </para></entry>
        <entry><para>lxvw4x</para></entry>
        <entry><para>lxvd2x ; xxpermdi</para></entry>
      </row>
      <row>
        <entry><para>vec_xld2<footnoteref linkend="dbdoclet.50655244_73052"/>
        </para></entry>
        <entry><para>lxvd2x</para></entry>
        <entry><para>lxvd2x ; xxpermdi</para></entry>
      </row>
      <row>
        <entry><para>vec_xst</para></entry>
        <entry><para>stxvd2x</para></entry>
        <entry><para>xxpermdi ; stxvd2x</para></entry>
      </row>
      <row>
        <entry><para>vec_xstw4<footnoteref linkend="dbdoclet.50655244_73052"/>
        </para></entry>
        <entry><para>stxvw4x</para></entry>
        <entry><para>xxpermdi ; stxvd2x</para></entry>
      </row>
      <row>
        <entry><para>vec_xstd2<footnoteref linkend="dbdoclet.50655244_73052"/>
        </para></entry>
        <entry><para>stxvd2x</para></entry>
        <entry><para>xxpermdi ; stxvd2x</para></entry>
      </row>
    </tbody>
  </tgroup>
</table>
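<para>The little-endian sequences in the table pair the load or store with a
doubleword swap. The sketch below models why, under the assumption that in
little-endian mode lxvd2x places the doubleword from the lower address in the
left (high-order) half of the register. All names
(<literal>vreg2</literal>, <literal>model_lxvd2x</literal>,
<literal>model_swap_doublewords</literal>, <literal>le_element</literal>) are
hypothetical, and the model tracks whole doublewords only.</para>
<programlisting language="c"><![CDATA[
```c
#include <assert.h>
#include <stdint.h>

/* Register image as two doublewords: dw[0] is the left (high-order) half. */
typedef struct { uint64_t dw[2]; } vreg2;

/* Model of lxvd2x: the doubleword at the lower address lands on the left. */
vreg2 model_lxvd2x(const uint64_t mem[2])
{
    vreg2 r = { { mem[0], mem[1] } };
    return r;
}

/* Model of the doubleword swap performed by xxpermdi in these sequences. */
vreg2 model_swap_doublewords(vreg2 v)
{
    vreg2 r = { { v.dw[1], v.dw[0] } };
    return r;
}

/* Read doubleword element i in LE element order (element 0 is rightmost). */
uint64_t le_element(vreg2 v, unsigned i)
{
    assert(i < 2);
    return v.dw[1 - i];
}
```
]]></programlisting>
<para>In this model, the load alone leaves memory doubleword 0 in
little-endian element 1; after the swap, element 0 holds memory doubleword 0
as the natural element order requires.</para>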
<para>The two optional built-in vector functions in
<xref linkend="dbdoclet.50655244_66443" /> can be used to load and store
vectors with a big-endian element ordering (that is, bytes from low to
high memory will be loaded from left to right into a vector char
variable), independent of the -qaltivec=be or -maltivec=be setting. For
more information, see
<xref linkend="dbdoclet.50655244_34309" />.</para>
<para> </para>
<table frame="all" pgwide="1" xml:id="dbdoclet.50655244_66443">
  <title>Optional Fixed Data Layout Built-In Vector Functions</title>
  <tgroup cols="3">
    <colspec colname="c1" colwidth="15*" align="center"/>
    <colspec colname="c2" colwidth="35*" align="center"/>
    <colspec colname="c3" colwidth="50*" />
    <thead>
      <row>
        <entry>
          <para>
            <emphasis role="bold">Built-in Function</emphasis>
          </para>
        </entry>
        <entry>
          <para>
            <emphasis role="bold">Corresponding POWER
            Instructions</emphasis>
          </para>
        </entry>
        <entry align="center">
          <para>
            <emphasis role="bold">Little-Endian Implementation
            Notes</emphasis>
          </para>
        </entry>
      </row>
    </thead>
    <tbody>
      <row>
        <entry><para>vec_xl_be</para></entry>
        <entry><para>lxvd2x</para></entry>
        <entry>
          <para>Use lxvd2x for vector long long, vector long,<footnote
          xml:id="vlongbad">
          <para>The vector long types are deprecated due to their
          ambiguity between 32-bit and 64-bit environments. The use
          of the vector long long types is preferred. </para>
          </footnote> and vector double.</para>
          <para>Use lxvd2x followed by reversal of elements within each
          doubleword for all other data types.</para>
        </entry>
      </row>
      <row>
        <entry><para>vec_xst_be</para></entry>
        <entry><para>stxvd2x</para></entry>
        <entry>
          <para>Use stxvd2x for vector long long, vector long,<footnoteref
          linkend="vlongbad" /> and vector double.</para>
          <para>Use stxvd2x following a reversal of elements within each
          doubleword for all other data types.</para>
        </entry>
      </row>
    </tbody>
  </tgroup>
</table>
<para>In addition to the hardware-specific vector built-in functions,
implementations are expected to provide the interfaces listed in
<xref linkend="dbdoclet.50655244_10651" />.</para>
<para> </para>
<table frame="all" pgwide="1" xml:id="dbdoclet.50655244_10651">
|
|
|
<title>Built-In Interfaces for Inserting and Extracting Elements from a
|
|
|
Vector</title>
|
|
|
<tgroup cols="2">
|
|
|
<colspec colname="c1" colwidth="40*" align="center"/>
|
|
|
<colspec colname="c2" colwidth="60*" />
|
|
|
<thead>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>
|
|
|
<emphasis role="bold">Built-In Function</emphasis>
|
|
|
</para>
|
|
|
</entry>
|
|
|
<entry align="center">
|
|
|
<para>
|
|
|
<emphasis role="bold">Implementation Notes</emphasis>
|
|
|
</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
</thead>
|
|
|
<tbody>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>vec_extract</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>vec_extract (v, 3) is equivalent to v[3].</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>vec_insert</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>vec_insert (x, v, 3) returns the vector v with element 3 (the
fourth element, counting from 0) modified to contain x.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
</tbody>
|
|
|
</tgroup>
|
|
|
</table>
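<para>As an illustration only, the following C sketch models the two
interfaces for a four-element vector of int by using a plain struct and
ordinary array indexing. It does not show how a compiler implements the
built-in functions. The names <literal>v4si_model</literal>,
<literal>model_vec_extract</literal>, and <literal>model_vec_insert</literal>
are hypothetical, and rejecting out-of-range indices is an assumption of
this model.</para>
<programlisting language="c"><![CDATA[
```c
#include <assert.h>

/* Hypothetical stand-in for "vector signed int": four 4-byte elements. */
typedef struct { int e[4]; } v4si_model;

/* Models vec_extract (v, i): returns element i, the same value as v[i]. */
static int model_vec_extract(v4si_model v, unsigned i)
{
    assert(i < 4);              /* assumption: this model rejects bad indices */
    return v.e[i];
}

/* Models vec_insert (x, v, i): returns a copy of v with element i set to x.
   The caller's vector is passed by value and is left unchanged. */
static v4si_model model_vec_insert(int x, v4si_model v, unsigned i)
{
    assert(i < 4);
    v.e[i] = x;
    return v;
}
```
]]></programlisting>
<para>In this model, inserting at index 3 changes only the last of the four
elements, and the original vector keeps its value because the result is
returned as a new vector.</para>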
|
|
|
<para>Environments may provide the optional built-in vector functions
|
|
|
listed in
|
|
|
<xref linkend="dbdoclet.50655244_10811" /> to adjust for endian behavior
|
|
|
by reversing the order of elements (reve) and bytes within elements
|
|
|
(revb).</para>
|
|
|
<para> </para>
|
|
|
<table frame="all" pgwide="1" xml:id="dbdoclet.50655244_10811">
|
|
|
<title>Optional Built-In Functions</title>
|
|
|
<tgroup cols="2">
|
|
|
<colspec colname="c1" colwidth="20*" />
|
|
|
<colspec colname="c2" colwidth="80*" />
|
|
|
<thead>
|
|
|
<row>
|
|
|
<entry align="center">
|
|
|
<para>
|
|
|
<emphasis role="bold">Name</emphasis>
|
|
|
</para>
|
|
|
</entry>
|
|
|
<entry align="center">
|
|
|
<para>
|
|
|
<emphasis role="bold">Description</emphasis>
|
|
|
</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
</thead>
|
|
|
<tbody>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>vec_revb</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Reverses the order of bytes within elements.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>vec_reve</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Reverses the order of elements.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
</tbody>
|
|
|
</tgroup>
|
|
|
</table>
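<para>The difference between the two reversals can be sketched in portable C
by modeling a vector as 16 raw bytes and passing the element size
explicitly. This is a hedged model, not compiler code: the names
<literal>vec_model</literal>, <literal>model_vec_reve</literal>, and
<literal>model_vec_revb</literal> are hypothetical, and the real built-in
functions take the element size from the vector type rather than from an
argument.</para>
<programlisting language="c"><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define VEC_BYTES 16

/* Hypothetical model of a 16-byte vector as raw bytes. */
typedef struct { unsigned char b[VEC_BYTES]; } vec_model;

static int valid_elem_size(size_t size)
{
    return size == 1 || size == 2 || size == 4 || size == 8 || size == 16;
}

/* Models vec_reve: the elements change places, but the bytes inside
   each element keep their order. */
static vec_model model_vec_reve(vec_model v, size_t size)
{
    vec_model r;
    assert(valid_elem_size(size));
    size_t n = VEC_BYTES / size;
    for (size_t i = 0; i < n; i++)
        memcpy(&r.b[i * size], &v.b[(n - 1 - i) * size], size);
    return r;
}

/* Models vec_revb: each element stays in place, but the bytes inside
   it are reversed. */
static vec_model model_vec_revb(vec_model v, size_t size)
{
    vec_model r;
    assert(valid_elem_size(size));
    for (size_t i = 0; i < VEC_BYTES; i += size)
        for (size_t j = 0; j < size; j++)
            r.b[i + j] = v.b[i + size - 1 - j];
    return r;
}
```
]]></programlisting>
<para>In this model, applying both functions with the same element size
reverses all 16 bytes of the vector.</para>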
|
|
|
<section xml:id="dbdoclet.50655244_34309">
|
|
|
<title>Big-Endian Vector Layout in Little-Endian Environments</title>
|
|
|
<para>Because the vector layout and element numbering cannot be
|
|
|
represented in source code in an endian-neutral manner, code originating
|
|
|
from big-endian platforms may need to be compiled on little-endian
|
|
|
platforms, or vice versa. To simplify such porting, some compilers may
provide an additional bridge mode that retains big-endian vector layout
and element numbering in little-endian environments.</para>
|
|
|
<para>Note that such support only works for homogeneous data being loaded
|
|
|
into vector registers (that is, no unions or structs containing elements
|
|
|
of different sizes) and when those vectors are loaded from and stored to
|
|
|
memory with element-size-specific built-in vector memory functions of
|
|
|
<xref linkend="dbdoclet.50655244_91731" /> and
|
|
|
<xref linkend="dbdoclet.50655244_21918" />. That is because, in this
|
|
|
mode, data within each element must be adjusted for little-endian data
|
|
|
representation while providing a big-endian layout and numbering of
|
|
|
vector elements within a vector.</para>
|
|
|
<note>
|
|
|
<para>Because of the internal contradiction of big-endian
|
|
|
vector layouts and little-endian data, such an environment will have
|
|
|
intrinsic limitations for the type of functionality that may be
|
|
|
offered. However, it may provide a useful bridge in the porting of
|
|
|
code using vector built-ins between environments having different
|
|
|
data layout models.</para>
|
|
|
</note>
|
|
|
<para>Compiler designers may implement additional built-in functions or
|
|
|
other mechanisms that use big-endian element ordering in little-endian
|
|
|
mode. For example, the GCC and IBM XL compilers define the options
<literal>-maltivec=be</literal> and <literal>-qaltivec=be</literal>,
respectively, which direct the compiler, in little-endian mode, to
generate for each built-in function the hardware instructions that
correspond to the big-endian sequence. To ensure consistent element
operation in this mode, the code generated for lvx and related
instructions adds appropriate permute sequences that maintain a
big-endian data layout in registers, as shown in
|
|
|
<xref linkend="dbdoclet.50655244_91731" />. The selected vector element
|
|
|
order is reflected in the __VEC_ELEMENT_REG_ORDER__ macro. See
|
|
|
<xref linkend="dbdoclet.50655243_page131" />.</para>
|
|
|
<table frame="all" pgwide="1" xml:id="dbdoclet.50655244_91731">
|
|
|
<title>AltiVec Built-In Vector Memory Access Functions (BE Layout in LE
|
|
|
Mode)</title>
|
|
|
<tgroup cols="3">
|
|
|
<colspec colname="c1" colwidth="15*" align="center"/>
|
|
|
<colspec colname="c2" colwidth="35*" align="center"/>
|
|
|
<colspec colname="c3" colwidth="50*" />
|
|
|
<thead>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>
|
|
|
<emphasis role="bold">Built-In Function</emphasis>
|
|
|
</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>
|
|
|
<emphasis role="bold">Corresponding POWER
|
|
|
Instructions</emphasis>
|
|
|
</para>
|
|
|
</entry>
|
|
|
<entry align="center">
|
|
|
<para>
|
|
|
<emphasis role="bold">BE Vector Layout in Little-Endian Mode
|
|
|
Implementation Notes</emphasis>
|
|
|
</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
</thead>
|
|
|
<tbody>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>vec_ld</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>lvx</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Reverse elements with a vperm after load for LE based on
|
|
|
vector base type.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>vec_lde</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>lvebx, lvehx, lvewx</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Reverse elements with a vperm after load for LE based on
|
|
|
vector base type.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>vec_ldl</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>lvxl</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Reverse elements with a vperm after load for LE based on
|
|
|
vector base type.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>vec_st</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>stvx</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Reverse elements with a vperm before store for LE based
|
|
|
on vector base type.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>vec_ste</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>stvebx, stvehx, stvewx</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Reverse elements with a vperm before store for LE based
|
|
|
on vector base type.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>vec_stl</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>stvxl</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Reverse elements with a vperm before store for LE based
|
|
|
on vector base type.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
</tbody>
|
|
|
</tgroup>
|
|
|
</table>
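<para>The load and store rows of the table can be sketched as follows. This
is a simplified model under stated assumptions: a register is represented as
16 bytes indexed in little-endian element order, the alignment behavior of
lvx and stvx is ignored, and the element reversal that a compiler would
perform with vperm is written as a loop. All names in the sketch are
hypothetical.</para>
<programlisting language="c"><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define VEC_BYTES 16

/* Hypothetical model of a vector register as 16 raw bytes. */
typedef struct { unsigned char b[VEC_BYTES]; } vec_model;

/* Stands in for the vperm: reverses the order of elements whose size
   (the vector base type) is elem_size bytes. */
static vec_model reverse_elements(vec_model v, size_t elem_size)
{
    vec_model r;
    assert(elem_size == 1 || elem_size == 2 ||
           elem_size == 4 || elem_size == 8);
    size_t n = VEC_BYTES / elem_size;
    for (size_t i = 0; i < n; i++)
        memcpy(&r.b[i * elem_size],
               &v.b[(n - 1 - i) * elem_size], elem_size);
    return r;
}

/* Models vec_ld in this mode: a plain 16-byte load, then the reversal. */
static vec_model model_vec_ld_be_order(const unsigned char *mem,
                                       size_t elem_size)
{
    vec_model v;
    memcpy(v.b, mem, VEC_BYTES);
    return reverse_elements(v, elem_size);
}

/* Models vec_st in this mode: the reversal first, then a plain store. */
static void model_vec_st_be_order(vec_model v, unsigned char *mem,
                                  size_t elem_size)
{
    v = reverse_elements(v, elem_size);
    memcpy(mem, v.b, VEC_BYTES);
}
```
]]></programlisting>
<para>Because the store applies the same reversal as the load, storing a
vector that was loaded with the same base type reproduces the original
memory image in this model.</para>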
|
|
|
<para>Potentially unaligned memory accesses may be performed by using
instructions (or instruction sequences) that perform a little-endian load
or store of the underlying vector data type while maintaining big-endian
element ordering. See
|
|
|
<xref linkend="dbdoclet.50655244_21918" />.</para>
|
|
|
<para> </para>
|
|
|
<table frame="all" pgwide="1" xml:id="dbdoclet.50655244_21918">
|
|
|
<title>VSX Built-In Memory Access Functions (BE Layout in LE
|
|
|
Mode)</title>
|
|
|
<tgroup cols="3">
|
|
|
<colspec colname="c1" colwidth="15*" align="center"/>
|
|
|
<colspec colname="c2" colwidth="35*" align="center"/>
|
|
|
<colspec colname="c3" colwidth="50*" />
|
|
|
<thead>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>
|
|
|
<emphasis role="bold">Built-In Function</emphasis>
|
|
|
</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>
|
|
|
<emphasis role="bold">Corresponding POWER
|
|
|
Instructions</emphasis>
|
|
|
</para>
|
|
|
</entry>
|
|
|
<entry align="center">
|
|
|
<para>
|
|
|
<emphasis role="bold">BE Vector Layout in Little-Endian Mode
|
|
|
Implementation Notes</emphasis>
|
|
|
</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
</thead>
|
|
|
<tbody>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>vec_xl</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>lxvd2x</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Use lxvd2x for vector long long; vector long,<footnote
|
|
|
xml:id="vlongawful">
|
|
|
<para>The vector long types are deprecated due to their
|
|
|
ambiguity between 32-bit and 64-bit environments. The use
|
|
|
of the vector long long types is preferred.</para>
|
|
|
</footnote> vector double.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>vec_xlw4<footnote xml:id="dbdoclet.50655244_78719">
|
|
|
<para>Deprecated. Vector data type assignment and the overloaded
vec_xl and vec_xst vector built-in functions are the preferred
forms for loading and storing vectors. Similarly, the use of
<literal>__builtin_lxvd2x</literal>, <literal>__builtin_lxvw4x</literal>,
<literal>__builtin_stxvd2x</literal>, and <literal>__builtin_stxvw4x</literal>,
available in some compilers, is discouraged.</para></footnote>
|
|
|
</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>lxvw4x</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Use lxvw4x for vector int; vector float.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>vec_xld2<footnoteref linkend="dbdoclet.50655244_78719"/>
|
|
|
</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>lxvd2x</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Use lxvd2x, followed by reversal of elements within each
|
|
|
doubleword, for all other data types.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>vec_xst</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>stxvd2x</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Use stxvd2x for vector long long; vector long,<footnoteref
|
|
|
linkend="vlongawful" /> vector double.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>vec_xstw4<footnoteref linkend="dbdoclet.50655244_78719"/>
|
|
|
</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>stxvw4x</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Use stxvw4x for vector int; vector float.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>vec_xstd2<footnoteref linkend="dbdoclet.50655244_78719"/>
|
|
|
</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>stxvd2x</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Use stxvd2x, following a reversal of elements within each
|
|
|
doubleword, for all other data types.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
</tbody>
|
|
|
</tgroup>
|
|
|
</table>
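<para>The lxvd2x rows of the table can be modeled in portable C as follows.
The sketch assumes a register represented as 16 bytes indexed in
little-endian element order; under that assumption, lxvd2x in little-endian
mode delivers the two doublewords in swapped positions. All names are
hypothetical, and the sketch is an illustration rather than a description of
any compiler's code generation.</para>
<programlisting language="c"><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define VEC_BYTES 16
#define DW_BYTES 8

/* Hypothetical model of a vector register as 16 raw bytes. */
typedef struct { unsigned char b[VEC_BYTES]; } vec_model;

/* Models lxvd2x in little-endian mode: relative to little-endian
   element order, the two doublewords arrive swapped. */
static vec_model model_lxvd2x_le(const unsigned char *mem)
{
    vec_model r;
    memcpy(&r.b[0], mem + DW_BYTES, DW_BYTES);
    memcpy(&r.b[DW_BYTES], mem, DW_BYTES);
    return r;
}

/* Reverses the elements of elem_size bytes inside each doubleword. */
static vec_model reverse_within_doublewords(vec_model v, size_t elem_size)
{
    vec_model r;
    assert(elem_size == 1 || elem_size == 2 || elem_size == 4);
    size_t per_dw = DW_BYTES / elem_size;
    for (size_t dw = 0; dw < VEC_BYTES; dw += DW_BYTES)
        for (size_t i = 0; i < per_dw; i++)
            memcpy(&r.b[dw + i * elem_size],
                   &v.b[dw + (per_dw - 1 - i) * elem_size], elem_size);
    return r;
}

/* Models the load: lxvd2x alone for 8-byte elements, lxvd2x plus the
   reversal within each doubleword for smaller elements. */
static vec_model model_vec_xl_be_order(const unsigned char *mem,
                                       size_t elem_size)
{
    assert(elem_size == 1 || elem_size == 2 ||
           elem_size == 4 || elem_size == 8);
    vec_model v = model_lxvd2x_le(mem);
    return elem_size == 8 ? v : reverse_within_doublewords(v, elem_size);
}
```
]]></programlisting>
<para>In this model, the doubleword swap combined with the reversal inside
each doubleword amounts to a full reversal of the elements, which is why no
additional step is needed for 8-byte element types.</para>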
|
|
|
<note>
|
|
|
<para>The use of -maltivec=be or -qaltivec=be in
|
|
|
little-endian mode disables the transformations described
|
|
|
in
|
|
|
<xref linkend="dbdoclet.50655244_35023" />.</para>
|
|
|
</note>
|
|
|
<para>The operation of the assignment operator is never changed by a
|
|
|
setting such as <literal>-qaltivec=be</literal> or <literal>-maltivec=be</literal>.</para>
|
|
|
</section>
|
|
|
</section>
|
|
|
<section xml:id="dbdoclet.50655244_20743" revisionflag="deleted">
|
|
|
<title>Language-Specific Vector Support for Other Languages</title>
|
|
|
<section xml:id="dbdoclet.50655244_37862">
|
|
|
<title>Fortran</title>
|
|
|
<para>
|
|
|
<xref linkend="dbdoclet.50655244_80766" /> shows the correspondence
|
|
|
between the C/C++ types described in this document and their Fortran
|
|
|
equivalents. In Fortran, the Boolean vector data types are represented by
|
|
|
VECTOR(UNSIGNED(n)).</para>
|
|
|
<para>Because the Fortran language does not support pointers, vector
|
|
|
built-in functions that expect pointers to a base type take an array
|
|
|
element reference to indicate the address of a memory location that is
|
|
|
the subject of a memory access built-in function.</para>
|
|
|
<para>Because the Fortran language does not support type casts, the
|
|
|
vec_convert and vec_concat built-in functions shown in
|
|
|
<xref linkend="dbdoclet.50655244_14722" /> are provided to perform
|
|
|
bit-exact type conversions between vector types.</para>
|
|
|
<para> </para>
|
|
|
<table frame="all" pgwide="1" xml:id="dbdoclet.50655244_14722">
|
|
|
<title>Built-In Vector Conversion Functions</title>
|
|
|
<tgroup cols="2">
|
|
|
<colspec colname="c1" colwidth="30*" align="center" />
|
|
|
<colspec colname="c2" colwidth="70*" />
|
|
|
<thead>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>
|
|
|
<emphasis role="bold">Group</emphasis>
|
|
|
</para>
|
|
|
</entry>
|
|
|
<entry align="center">
|
|
|
<para>
|
|
|
<emphasis role="bold">Description</emphasis>
|
|
|
</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
</thead>
|
|
|
<tbody>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>VEC_CONCAT (ARG1, ARG2)<?linebreak?>(Fortran)</para>
|
|
|
<para></para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Purpose:</para>
|
|
|
<para>Concatenates two elements to form a vector.</para>
|
|
|
<para>Result value:</para>
|
|
|
<para>The resulting vector consists of the two scalar elements,
|
|
|
ARG1 and ARG2, assigned to elements 0 and 1 (using the
|
|
|
environment’s native endian numbering), respectively.</para>
|
|
|
<itemizedlist>
|
|
|
<listitem>
|
|
|
<para><emphasis role="bold">Note: </emphasis>This function corresponds to the C/C++ vector
|
|
|
constructor (vector type){a,b}. It is provided only for
|
|
|
languages without vector constructors.</para>
|
|
|
</listitem>
|
|
|
</itemizedlist>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para></para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>vector signed long long vec_concat (signed long long,
|
|
|
signed long long);</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para></para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>vector unsigned long long vec_concat (unsigned long long,
|
|
|
unsigned long long);</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para></para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>vector double vec_concat (double, double);</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>VEC_CONVERT(V, MOLD)</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>Purpose:</para>
|
|
|
<para>Converts a vector to a vector of a given type.</para>
|
|
|
<para>Class:</para>
|
|
|
<para>Pure function</para>
|
|
|
<para>Argument type and attributes:</para>
|
|
|
<itemizedlist spacing="compact">
|
|
|
<listitem>
|
|
|
<para>V Must be an INTENT(IN) vector.</para>
|
|
|
</listitem>
|
|
|
<listitem>
|
|
|
<para>MOLD Must be an INTENT(IN) vector. If it is a
|
|
|
variable, it need not be defined.</para>
|
|
|
</listitem>
|
|
|
</itemizedlist>
|
|
|
<para>Result type and attributes:</para>
|
|
|
<para>The result is a vector of the same type as MOLD.</para>
|
|
|
<para>Result value:</para>
|
|
|
<para>The result is as if it were on the left-hand side of an
|
|
|
intrinsic assignment with V on the right-hand side.</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
</tbody>
|
|
|
</tgroup>
|
|
|
</table>
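<para>For comparison, a bit-exact conversion of the kind that VEC_CONVERT
provides can be sketched in C as a byte copy between two equally sized
vector models. The type and function names below are hypothetical, and the
expected bit patterns mentioned afterward assume that float uses the IEEE
754 binary32 format.</para>
<programlisting language="c"><![CDATA[
```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical models of VECTOR(REAL(4)) and VECTOR(UNSIGNED(4)). */
typedef struct { float e[4]; } v4f_model;
typedef struct { uint32_t e[4]; } v4u_model;

/* Models VEC_CONVERT(V, MOLD) for a VECTOR(UNSIGNED(4)) MOLD: the 16
   bytes are copied unchanged, so no numeric conversion takes place. */
static v4u_model model_convert_f_to_u(v4f_model v)
{
    v4u_model r;
    assert(sizeof r == sizeof v);   /* both models occupy 16 bytes */
    memcpy(&r, &v, sizeof r);
    return r;
}
```
]]></programlisting>
<para>Under the stated assumption, converting a vector whose first element
is 1.0 yields the unsigned value 0x3F800000 in that element, not the
integer 1.</para>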
|
|
|
<para>
|
|
|
<xref linkend="dbdoclet.50655244_80766" /> gives a correspondence of
|
|
|
Fortran and C/C++ language types.</para>
|
|
|
<para> </para>
|
|
|
<table frame="all" pgwide="1" xml:id="dbdoclet.50655244_80766">
|
|
|
<title>Fortran Vector Data Types</title>
|
|
|
<tgroup cols="2">
|
|
|
<colspec colname="c1" colwidth="50*" />
|
|
|
<colspec colname="c2" colwidth="50*" />
|
|
|
<thead>
|
|
|
<row>
|
|
|
<entry align="center">
|
|
|
<para>
|
|
|
<emphasis role="bold">XL Fortran Vector Type</emphasis>
|
|
|
</para>
|
|
|
</entry>
|
|
|
<entry align="center">
|
|
|
<para>
|
|
|
<emphasis role="bold">XL C/C++ Vector Type</emphasis>
|
|
|
</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
</thead>
|
|
|
<tbody>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>VECTOR(INTEGER(1))</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>vector signed char</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>VECTOR(INTEGER(2))</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>vector signed short</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>VECTOR(INTEGER(4))</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>vector signed int</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>VECTOR(INTEGER(8))</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>vector signed long long, vector signed long<footnote
|
|
|
xml:id="vlongappalling">
|
|
|
<para>The vector long types are deprecated due to their
|
|
|
ambiguity between 32-bit and 64-bit environments. The use
|
|
|
of the vector long long types is preferred.</para>
|
|
|
</footnote></para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>VECTOR(INTEGER(16))</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>vector signed __int128</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>VECTOR(UNSIGNED(1))</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>vector unsigned char</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>VECTOR(UNSIGNED(2))</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>vector unsigned short</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>VECTOR(UNSIGNED(4))</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>vector unsigned int</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>VECTOR(UNSIGNED(8))</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>vector unsigned long long, vector unsigned long<footnoteref
|
|
|
linkend="vlongappalling" /></para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>VECTOR(UNSIGNED(16))</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>vector unsigned __int128</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>VECTOR(REAL(4))</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>vector float</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>VECTOR(REAL(8))</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>vector double</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
<row>
|
|
|
<entry>
|
|
|
<para>VECTOR(PIXEL)</para>
|
|
|
</entry>
|
|
|
<entry>
|
|
|
<para>vector pixel</para>
|
|
|
</entry>
|
|
|
</row>
|
|
|
</tbody>
|
|
|
</tgroup>
|
|
|
</table>
|
|
|
</section>
|
|
|
</section>
|
|
|
<section>
|
|
|
<title>Library Interfaces</title>
|
|
|
<section>
|
|
|
<title>printf and scanf of Vector Data Types</title>
|
|
|
<para>Support for vector variable input and output
|
|
|
<emphasis>may</emphasis> be provided as an extension to the following
|
|
|
POSIX library functions for the new vector conversion format
|
|
|
strings:</para>
|
|
|
<itemizedlist spacing="compact">
|
|
|
<listitem>
|
|
|
<para>scanf</para>
|
|
|
</listitem>
|
|
|
<listitem>
|
|
|
<para>fscanf</para>
|
|
|
</listitem>
|
|
|
<listitem>
|
|
|
<para>sscanf</para>
|
|
|
</listitem>
|
|
|
<listitem>
|
|
|
<para>swscanf</para>
|
|
|
</listitem>
|
|
|
<listitem>
|
|
|
<para>printf</para>
|
|
|
</listitem>
|
|
|
<listitem>
|
|
|
<para>fprintf</para>
|
|
|
</listitem>
|
|
|
<listitem>
|
|
|
<para>sprintf</para>
|
|
|
</listitem>
|
|
|
<listitem>
|
|
|
<para>snprintf</para>
|
|
|
</listitem>
|
|
|
<listitem>
|
|
|
<para>swprintf</para>
|
|
|
</listitem>
|
|
|
<listitem>
|
|
|
<para>vprintf</para>
|
|
|
</listitem>
|
|
|
<listitem>
|
|
|
<para>vfprintf</para>
|
|
|
</listitem>
|
|
|
<listitem>
|
|
|
<para>vsprintf</para>
|
|
|
</listitem>
|
|
|
<listitem>
|
|
|
<para>vswprintf</para>
|
|
|
</listitem>
|
|
|
</itemizedlist>
|
|
|
<para>(One sample implementation for such an extended specification is
|
|
|
libvecprintf.)</para>
|
|
|
<para>The size formatters are as follows:</para>
|
|
|
<itemizedlist>
|
|
|
<listitem>
|
|
|
<para>vl or lv consumes one argument and modifies an existing integer
|
|
|
conversion, resulting in vector signed int, vector unsigned int, or
|
|
|
vector bool int for output conversions or vector signed int * or vector
|
|
|
unsigned int * for input conversions. The data is then treated as a
|
|
|
series of four 4-byte components, with the subsequent conversion
|
|
|
format applied to each.</para>
|
|
|
</listitem>
|
|
|
<listitem>
|
|
|
<para>vh or hv consumes one argument and modifies an existing short
|
|
|
integer conversion, resulting in vector signed short or vector
|
|
|
unsigned short for output conversions or vector signed short * or
|
|
|
vector unsigned short * for input conversions. The data is treated as
|
|
|
a series of eight 2-byte components, with the subsequent conversion
|
|
|
format applied to each.</para>
|
|
|
</listitem>
|
|
|
<listitem>
|
|
|
<para>v consumes one argument and modifies a 1-byte integer, 1-byte
|
|
|
character, or 4-byte floating-point conversion. If the conversion is
|
|
|
a floating-point conversion, the result is vector float for output
|
|
|
conversion or vector float * for input conversion. The data is
|
|
|
treated as a series of four 4-byte floating-point components with the
|
|
|
subsequent conversion format applied to each. If the conversion is an
|
|
|
integer or character conversion, the result is either vector signed
|
|
|
char, vector unsigned char, or vector bool char for output
|
|
|
conversion, or vector signed char * or vector unsigned char * for
|
|
|
input conversions. The data is treated as a series of sixteen 1-byte
|
|
|
components, with the subsequent conversion format applied to
|
|
|
each.</para>
|
|
|
</listitem>
|
|
|
<listitem>
|
|
|
<para>vv consumes one argument and modifies an 8-byte floating-point
|
|
|
conversion. If the conversion is a floating-point conversion, the
|
|
|
result is vector double for output conversion or vector double * for
|
|
|
input conversion. The data is treated as a series of two 8-byte
|
|
|
floating-point components with the subsequent conversion format
|
|
|
applied to each. Integer and byte conversions are not defined for the
|
|
|
vv modifier.</para>
|
|
|
</listitem>
|
|
|
</itemizedlist>
|
|
|
<note>
|
|
|
<para>As new vector types are defined, new format codes should
|
|
|
be defined to support scanf and printf of those types.</para>
|
|
|
</note>
|
|
|
<para>Any conversion format that can be applied to the scalar form of a
vector data type can be used with the vector form. The %d, %x, %X, %u, %i,
and %o integer conversions can be applied with the %lv, %vl, %hv, %vh,
and %v vector-length qualifiers. The %c character conversion can be
applied with the %v vector-length qualifier. The %a, %A, %e, %E, %f, %F,
%g, and %G floating-point conversions can be applied with the %v
vector-length qualifier or, for 8-byte floating-point components, with
the %vv vector-length qualifier.</para>
|
|
|
<para>For input conversions, an optional separator character, excluding
any white space preceding the separator, can be specified. If no separator
is specified, the default separator is a space, including any white space
characters preceding the separator, unless the conversion is c, in which
case the default separator is null.</para>
|
|
|
<para>For output conversions, an optional separator character can be
|
|
|
specified immediately preceding the vector size conversion. If no
|
|
|
separator is specified, the default separator is a space unless the
|
|
|
conversion is c. Then, the default separator is null.</para>
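<para>The effect of an output conversion on a vector of four 4-byte integers
can be sketched with standard C alone. The function below is a hypothetical
model, not part of libvecprintf or of any C library: it applies the scalar
%d conversion to each component and places the separator between
components, with a null separator meaning that nothing is written between
them.</para>
<programlisting language="c"><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Models a vector output conversion for four 4-byte integers: each
   component is formatted with %d, and sep is written between components
   ('\0' means no separator). Returns the number of characters written,
   or -1 if the buffer is too small. */
static int model_print_v4si(char *buf, size_t size, char sep,
                            const int v[4])
{
    size_t used = 0;
    assert(buf != NULL && size > 0);
    buf[0] = '\0';
    for (int i = 0; i < 4; i++) {
        if (i > 0 && sep != '\0') {
            if (used + 1 >= size)
                return -1;          /* no room for separator and NUL */
            buf[used++] = sep;
            buf[used] = '\0';
        }
        int n = snprintf(buf + used, size - used, "%d", v[i]);
        if (n < 0 || (size_t)n >= size - used)
            return -1;              /* output would be truncated */
        used += (size_t)n;
    }
    return (int)used;
}
```
]]></programlisting>
<para>With the default space separator, the components 1, -2, 30, and 400
are written as the 11-character string "1 -2 30 400" in this model.</para>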
|
|
|
<para> </para>
|
|
|
</section>
|
|
|
</section>
|
|
|
</chapter>
|