|
|
|
|
<?xml version="1.0" encoding="UTF-8"?>
|
|
|
|
|
<!--
|
|
|
|
|
Copyright (c) 2017 OpenPOWER Foundation
|
|
|
|
|
|
|
|
|
|
Licensed under the Apache License, Version 2.0 (the "License");
|
|
|
|
|
you may not use this file except in compliance with the License.
|
|
|
|
|
You may obtain a copy of the License at
|
|
|
|
|
|
|
|
|
|
http://www.apache.org/licenses/LICENSE-2.0
|
|
|
|
|
|
|
|
|
|
Unless required by applicable law or agreed to in writing, software
|
|
|
|
|
distributed under the License is distributed on an "AS IS" BASIS,
|
|
|
|
|
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
|
|
|
|
See the License for the specific language governing permissions and
|
|
|
|
|
limitations under the License.
|
|
|
|
|
|
|
|
|
|
-->
|
|
|
|
|
<section xmlns="http://docbook.org/ns/docbook"
|
|
|
|
|
xmlns:xi="http://www.w3.org/2001/XInclude"
|
|
|
|
|
xmlns:xlink="http://www.w3.org/1999/xlink"
|
|
|
|
|
version="5.0"
|
|
|
|
|
xml:id="sec_performance_mmx">
|
|
|
|
|
<title>Using MMX intrinsics</title>
|
|
|
|
|
|
|
|
|
|
<para>MMX was the first and oldest SIMD extension and initially filled a
|
|
|
|
|
need for wider (64-bit) integer and additional registers. This is back when
|
|
|
|
|
processors were 32-bit and 8 x 32-bit registers was starting to cramp our
|
|
|
|
|
programming style. Now 64-bit processors, larger register sets, and 128-bit (or
|
|
|
|
|
larger) vector SIMD extensions are common. There is simply no good reason to
|
|
|
|
|
write new code using the (now) very limited MMX capabilities. </para>
|
|
|
|
|
|
|
|
|
|
<para>We recommend that existing MMX codes be rewritten to use the newer
|
|
|
|
|
SSE and VMX/VSX intrinsics or using the more portable GCC builtin vector
|
|
|
|
|
support or in the case of si64 operations use C scalar code. The MMX si64
|
|
|
|
|
scalars are just (64-bit) operations on long long int types and any
|
|
|
|
|
modern C compiler can handle this type. The char / short in SIMD operations
|
|
|
|
|
should all be promoted to 128-bit SIMD operations on GCC builtin vectors. Both
|
|
|
|
|
will improve cross platform portability and performance.</para>
|
|
|
|
|
|
|
|
|
|
</section>
|
|
|
|
|
|