Benefits of Using Intrinsics
Microsoft Specific
The major benefit of using intrinsics is that you now have access to key features that are not available using conventional coding practices:
New Registers: Enable packed data of up to 128 bits in length for optimal SIMD processing.
New Data Types: Enable packing of up to 16 elements of data in one register.
New Registers
New register sets, a key feature, are provided by the architecture of the processors.
MMX Technology Register
The MMX technology intrinsics provide eight new registers (MM0 to MM7) that are 64 bits long (0 to 63).
Streaming SIMD Extensions (SSE) and Streaming SIMD Extensions 2 (SSE2) Instructions Registers
The SSE and the SSE2 instructions make use of yet another eight registers (XMM0 to XMM7) that are 128 bits in length.
These new data registers enable the processing of data elements in parallel. Because each register can hold more than one data element, the processor can process more than one data element simultaneously. This processing capability is also known as SIMD processing. To enable SIMD processing with the C/C++ compiler, new data types are defined to exploit the expanded size of the new registers.
Using intrinsics allows you to code with the syntax of C function calls and variables instead of assembly language. For each computational and data manipulation instruction in the new extension sets, there is a corresponding C intrinsic that directly implements that instruction. This frees you from managing registers and writing assembly code. Further, the compiler schedules the resulting instructions, which can make your executable run faster.
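As an illustration of the function-call syntax described above, the following minimal sketch adds four pairs of single-precision values with one SSE addition. The helper name `add4` is hypothetical; the intrinsics `_mm_loadu_ps`, `_mm_add_ps`, and `_mm_storeu_ps` come from `<xmmintrin.h>`. The unaligned load and store forms are used here so that the caller's arrays need no special alignment.

```c
#include <xmmintrin.h>  /* SSE intrinsics and the __m128 type */

/* Adds four pairs of floats with a single packed addition.
   add4 is a hypothetical helper name used only for this sketch. */
void add4(const float *a, const float *b, float *out)
{
    __m128 va  = _mm_loadu_ps(a);     /* load 4 floats, no alignment required */
    __m128 vb  = _mm_loadu_ps(b);
    __m128 sum = _mm_add_ps(va, vb);  /* corresponds to the ADDPS instruction */
    _mm_storeu_ps(out, sum);          /* write the 4 results back to memory */
}
```

No register names appear in the source; the compiler decides which XMM registers hold `va`, `vb`, and `sum`.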
New Data Types
New C data types, representing the new registers, are used as the operands to these intrinsic functions. These data types are listed in the New Data Types Available for Intrinsic Extensions table.
New Data Types Available for Intrinsic Extensions
New data type | MMX technology | Streaming SIMD Extensions | Streaming SIMD Extensions 2 instructions |
---|---|---|---|
__m64 | Yes | Yes | Yes |
__m128 | Not available | Yes | Yes |
__m128d | Not available | Not available | Yes |
__m128i | Not available | Not available | Yes |
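The two SSE2-only types in the table can be sketched as follows: `__m128i` holding sixteen 8-bit integers and `__m128d` holding two doubles. The helper names `add16_bytes` and `add2_doubles` are hypothetical; the intrinsics are declared in `<emmintrin.h>`.

```c
#include <emmintrin.h>  /* SSE2 intrinsics: __m128d and __m128i */

/* Adds sixteen 8-bit integers in one operation (hypothetical helper).
   The addition wraps around on overflow. */
void add16_bytes(const unsigned char *a, const unsigned char *b,
                 unsigned char *out)
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, _mm_add_epi8(va, vb));
}

/* Adds two pairs of doubles in one operation (hypothetical helper). */
void add2_doubles(const double *a, const double *b, double *out)
{
    __m128d va = _mm_loadu_pd(a);
    __m128d vb = _mm_loadu_pd(b);
    _mm_storeu_pd(out, _mm_add_pd(va, vb));
}
```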
The __m64 data type
The __m64 data type is used to represent the contents of an MMX register, which is the register used by the MMX technology intrinsics. The __m64 data type can hold eight 8-bit values, four 16-bit values, two 32-bit values, or one 64-bit value.
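The four 16-bit layout can be sketched as below. This is a minimal example that assumes a compiler and target that still accept MMX intrinsics (the Microsoft compiler does not support `__m64` for x64 targets); `add4_shorts` is a hypothetical helper name. Note that `_mm_set_pi16` takes its arguments from the highest element down to the lowest.

```c
#include <mmintrin.h>  /* MMX intrinsics and the __m64 type */
#include <string.h>    /* memcpy */

/* Adds four pairs of 16-bit integers held in __m64 values.
   add4_shorts is a hypothetical helper name used only for this sketch. */
void add4_shorts(const short *a, const short *b, short *out)
{
    /* Arguments run from the highest element to the lowest. */
    __m64 va  = _mm_set_pi16(a[3], a[2], a[1], a[0]);
    __m64 vb  = _mm_set_pi16(b[3], b[2], b[1], b[0]);
    __m64 sum = _mm_add_pi16(va, vb);  /* corresponds to PADDW */

    memcpy(out, &sum, sizeof sum);     /* copy the 64-bit result out as 4 shorts */
    _mm_empty();                       /* EMMS: release MMX state before x87 code */
}
```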
The __m128 data types
The compiler aligns __m128 local data to 16-byte boundaries on the stack. Global data of these types is also 16-byte aligned.
To align integer, float, or double arrays, you can use the __declspec(align(16)) modifier.
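A minimal sketch of aligning a float array so that the aligned load and store intrinsics can be used on it. The `ALIGN16` macro, the `samples` array, and both helper functions are hypothetical names; the non-Microsoft branch of the macro is an assumption added only so that the sketch also builds with GCC-style compilers.

```c
#include <stdint.h>     /* uintptr_t */
#include <xmmintrin.h>  /* SSE intrinsics */

/* ALIGN16 is a hypothetical convenience macro. The second branch is an
   assumption for compilers that do not accept __declspec. */
#if defined(_MSC_VER)
#define ALIGN16 __declspec(align(16))
#else
#define ALIGN16 __attribute__((aligned(16)))
#endif

/* A float array placed on a 16-byte boundary. */
ALIGN16 float samples[4] = {1.0f, 2.0f, 3.0f, 4.0f};

/* Returns nonzero when p lies on a 16-byte boundary. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p % 16) == 0;
}

/* Multiplies four floats in place; p must be 16-byte aligned because
   _mm_load_ps and _mm_store_ps are the aligned forms. */
void scale4_aligned(float *p, float factor)
{
    __m128 v = _mm_load_ps(p);
    v = _mm_mul_ps(v, _mm_set1_ps(factor));
    _mm_store_ps(p, v);
}
```

Passing an unaligned pointer to `scale4_aligned` can fault at run time, which is why the alignment is declared on the array itself.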
Because the new instruction set treats the SSE registers in the same way whether you are using packed or scalar data, there is no __m32 data type to represent scalar data as you might expect. For scalar operations, you should use the __m128 objects and the scalar forms of the intrinsics; the compiler and the processor implement these operations with 32-bit memory references.
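The scalar forms can be sketched as follows: both operands are still `__m128` objects, but only the lowest element takes part in the addition. `add_scalar` is a hypothetical helper name.

```c
#include <xmmintrin.h>  /* SSE intrinsics */

/* Adds two floats using the scalar (_ss) forms of the SSE intrinsics.
   add_scalar is a hypothetical helper name used only for this sketch. */
float add_scalar(float a, float b)
{
    __m128 va  = _mm_set_ss(a);       /* a in the lowest element, upper three zero */
    __m128 vb  = _mm_set_ss(b);
    __m128 sum = _mm_add_ss(va, vb);  /* ADDSS: adds only the lowest element */
    return _mm_cvtss_f32(sum);        /* extract the lowest element as a float */
}
```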
New Data Types Usage Guidelines
The new data types listed in the New Data Types Available for Intrinsic Extensions table are not basic ANSI C data types, and therefore you must observe the following usage restrictions:
Use new data types only on the left side of an assignment, as a return value, or as a parameter. You cannot use them with arithmetic operators (such as "+").
Use new data types as objects in aggregates, such as unions, to access the byte elements and structures. The address of an __m64 or __m128 object may be taken.
Use new data types only with the respective intrinsics described in this guide.
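The guidelines above can be sketched together in one short example: the addition goes through an intrinsic rather than the `+` operator, and a union gives access to the individual elements. The `vec4` union and the `sum_elements` function are hypothetical names. Some compilers accept operators on these types as an extension, but that behavior should not be assumed.

```c
#include <xmmintrin.h>  /* SSE intrinsics */

/* Hypothetical union that overlays a __m128 with its four float elements. */
typedef union {
    __m128 v;
    float  f[4];
} vec4;

/* Adds a and b element by element, then sums the four results. */
float sum_elements(__m128 a, __m128 b)
{
    vec4 r;
    r.v = _mm_add_ps(a, b);  /* use the intrinsic, not "a + b" */
    return r.f[0] + r.f[1] + r.f[2] + r.f[3];  /* ordinary float arithmetic */
}
```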
For complete details of the hardware instructions, see the Intel Architecture MMX Technology Programmer's Reference Manual. For descriptions of data types, see the Intel Architecture Software Developer's Manual, volume 2: Instruction Set Reference Manual.