Getting started (DirectXMath)

The DirectXMath Library implements an optimal and portable interface for arithmetic and linear algebra operations on single-precision floating-point vectors (2D, 3D, and 4D) and matrices (3×3 and 4×4). These operations are used extensively in rendering and animation by graphics programs. Support for integer vector operations is limited, and there is no support for double-precision vectors (nor for vectors of longs, shorts, or bytes).

The library is available on a variety of Windows platforms. Because the library provides functionality not available previously, this version supersedes the following libraries:

  • Xbox Math library provided by the Xboxmath.h header
  • D3DX 9 library provided by the D3DX 9 DLLs
  • D3DX 10 math library provided through the D3DX 10 DLLs
  • XNA Math library provided by the xnamath.h header in the DirectX SDK and Xbox 360 XDK

These sections outline the basics of getting started.


The DirectXMath library is included in the Windows SDK. Alternatively, you can download it from GitHub/Microsoft/DirectXMath, which also contains related sample projects.

Run-Time System Requirements

The DirectXMath Library uses specialized processor instructions for vector operations when they are available. To avoid having a program generate "unknown instruction exception" faults, check for processor support by calling XMVerifyCPUSupport before using the DirectXMath Library.

These are the basic DirectXMath Library run-time support requirements:

  • Default compilation on a Windows (x86/x64) platform requires SSE/SSE2 instruction support.
  • Default compilation on a Windows RT platform requires ARM-NEON instruction support.
  • Compilation with _XM_NO_INTRINSICS_ defined requires only standard floating-point operation support.


When you call XMVerifyCPUSupport, include <windows.h> before you include <DirectXMath.h>. This is the only function in the library that requires any content from <windows.h>, so you aren't required to include <windows.h> in every module that uses <DirectXMath.h>.


Design Overview

The DirectXMath Library primarily supports the C++ programming language. The library is implemented using inline routines in the header files, DirectXMath*.inl, DirectXPackedVector.inl and DirectXCollision.inl. This implementation makes use of high-performance compiler intrinsics.

The DirectXMath Library provides:

  • An implementation using SSE/SSE2 intrinsics.
  • An implementation without intrinsics.
  • An implementation using ARM-NEON intrinsics.

Because the library is delivered using header files, use the source code to customize and optimize for your own app.

Matrix convention

DirectXMath uses row-major matrices, row vectors, and pre-multiplication. Handedness is determined by which function version is used (RH vs. LH); otherwise, the function works with either left-handed or right-handed view coordinates.

For reference, Direct3D has historically used a left-handed coordinate system, row-major matrices, row vectors, and pre-multiplication. Modern Direct3D does not have a strong requirement for left vs. right-handed coordinates, and typically HLSL shaders default to consuming column-major matrices. See HLSL Matrix Ordering for details.

Basic Usage

To use DirectXMath Library functions, include the DirectXMath.h, DirectXPackedVector.h, DirectXColors.h, and/or DirectXCollision.h headers. The headers are found in the Windows Software Development Kit for Windows Store apps.

Type Usage Guidelines

The XMVECTOR and XMMATRIX types are the workhorses of the DirectXMath Library. Every operation consumes or produces data of these types. Working with them is key to using the library. However, since DirectXMath makes use of SIMD instruction sets, these data types are subject to a number of restrictions. It is critical that you understand these restrictions if you want to make good use of the DirectXMath functions.

You should think of XMVECTOR as a proxy for a SIMD hardware register, and XMMATRIX as a proxy for a logical grouping of four SIMD hardware registers. These types are annotated to indicate they require 16-byte alignment to work correctly. The compiler will automatically place them correctly on the stack when they are used as a local variable, or place them in the data segment when they are used as a global variable. With proper conventions, they can also be passed safely as parameters to a function (see Calling Conventions for details).

Allocations from the heap, however, are more complicated. As such, you need to be careful whenever you use either XMVECTOR or XMMATRIX as a member of a class or structure to be allocated from the heap. On Windows x64, all heap allocations are 16-byte aligned, but for Windows x86, they are only 8-byte aligned. There are options for allocating structures from the heap with 16-byte alignment (see Properly Align Allocations). For C++ programs, you can use operator new/delete/new[]/delete[] overloads (either globally or class-specific) to enforce optimal alignment if desired.
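
One way to apply the class-specific operator new/delete approach is sketched below. To keep the sketch buildable without DirectXMath, Vec4 is a stand-in for XMVECTOR, and the Particle class and IsAligned16 helper are hypothetical. The non-Windows branch uses posix_memalign, which is an assumption about the target platform.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>
#if defined(_WIN32)
#include <malloc.h>   // _aligned_malloc, _aligned_free
#endif

// Stand-in for XMVECTOR: four floats that require 16-byte alignment.
struct alignas(16) Vec4 { float f[4]; };

// Hypothetical class that holds aligned members and is created on the heap.
class Particle
{
public:
    Vec4 position{};
    Vec4 velocity{};

    // Class-specific allocation that always returns 16-byte aligned memory.
    static void* operator new(std::size_t size)
    {
        void* p = nullptr;
#if defined(_WIN32)
        p = _aligned_malloc(size, 16);
#else
        if (posix_memalign(&p, 16, size) != 0) p = nullptr;
#endif
        if (!p) throw std::bad_alloc();
        return p;
    }

    // Must release memory with the function that matches the allocator.
    static void operator delete(void* p) noexcept
    {
#if defined(_WIN32)
        _aligned_free(p);
#else
        std::free(p);
#endif
    }
};

// Small check used to confirm the alignment of a pointer.
bool IsAligned16(const void* p)
{
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}
```

The array forms (new[] and delete[]) can be overloaded in the same way if arrays of the class are allocated from the heap.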


As an alternative to enforcing alignment in your C++ class directly by overloading new/delete, you can use the pImpl idiom. If you ensure your Impl class is aligned via _aligned_malloc internally, you can then freely use aligned types within the internal implementation. This is a good option when the 'public' class is a Windows Runtime ref class or intended for use with std::shared_ptr<>, which can otherwise disrupt careful alignment.
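
The pImpl variant can be sketched as follows. Camera, its Impl, and ImplAddress are hypothetical names; the float array stands in for an XMMATRIX member so the sketch builds without DirectXMath, and posix_memalign is assumed on non-Windows platforms. In a real project the first part would live in a header and the second in a source file.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <memory>
#include <new>
#if defined(_WIN32)
#include <malloc.h>   // _aligned_malloc, _aligned_free
#endif

// ---- Public part: no aligned members, so any allocator may own a Camera ----
class Camera
{
public:
    Camera();
    ~Camera();
    std::uintptr_t ImplAddress() const;   // exposed only to demonstrate alignment
private:
    struct Impl;
    std::unique_ptr<Impl> m_impl;
};

// ---- Private part: aligned data and aligned allocation live here ----
struct Camera::Impl
{
    alignas(16) float view[16];           // stand-in for an XMMATRIX member

    static void* operator new(std::size_t size)
    {
        void* p = nullptr;
#if defined(_WIN32)
        p = _aligned_malloc(size, 16);
#else
        if (posix_memalign(&p, 16, size) != 0) p = nullptr;
#endif
        if (!p) throw std::bad_alloc();
        return p;
    }

    static void operator delete(void* p) noexcept
    {
#if defined(_WIN32)
        _aligned_free(p);
#else
        std::free(p);
#endif
    }
};

Camera::Camera() : m_impl(new Impl()) {}
Camera::~Camera() = default;

std::uintptr_t Camera::ImplAddress() const
{
    return reinterpret_cast<std::uintptr_t>(m_impl.get());
}
```

Because the public class holds only a pointer, creating it through std::make_shared or another allocator outside your control does not affect the alignment of the data inside Impl.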


However, it is often easier and more compact to avoid using XMVECTOR or XMMATRIX directly in a class or structure. Instead, use XMFLOAT3, XMFLOAT4, XMFLOAT4X3, XMFLOAT4X4, and so on, as members of your structure. You can then use the Vector Loading and Vector Storage functions to move the data efficiently into XMVECTOR or XMMATRIX local variables, perform computations, and store the results. There are also streaming functions (XMVector3TransformStream, XMVector4TransformStream, and so on) that efficiently operate directly on arrays of these data types.

Creating Vectors


Many operations require the use of constants in vector computations, and there are a number of ways to load an XMVECTOR with the desired values.

  • If loading a scalar constant into all elements of an XMVECTOR, use XMVectorReplicate or XMVectorReplicateInt.

    XMVECTOR vFive = XMVectorReplicate( 5.f );
  • If using a vector constant with different fixed values as an XMVECTOR, use the XMVECTORF32, XMVECTORU32, XMVECTORI32, or XMVECTORU8 structures. These can then be referenced directly anywhere you would pass an XMVECTOR value.

    static const XMVECTORF32 vFactors = { 1.0f, 2.0f, 3.0f, 4.0f };


    Do not use initializer lists directly with XMVECTOR (that is, XMVECTOR v = { 1.0f, 2.0f, 3.0f, 4.0f }). Such code is inefficient and is not portable across all platforms that are supported by DirectXMath.


  • DirectXMath includes a number of pre-defined global constants you can make use of in your code (g_XMOne, g_XMOne3, g_XMTwo, g_XMOneHalf, g_XMHalfPi, g_XMPi, and so on). Search the DirectXMath.h header for the XMGLOBALCONST values.

  • There is a set of vector constants for common RGB colors (Red, Green, Blue, Yellow, and so on). For more info about these vector constants, see DirectXColors.h and the DirectX::Colors namespace.



  • If creating a vector from another vector with a specific component set to a variable, consider using Vector Accessor Functions.

    XMVECTOR v2 = XMVectorSetW( v1, fw );
  • If creating a vector from another vector with a single component replicated, use XMVectorSplatX, XMVectorSplatY, XMVectorSplatZ, and XMVectorSplatW.

    XMVECTOR vz = XMVectorSplatZ( v );
  • If creating a vector from another vector or pair of vectors with reordered components, see XMVectorSwizzle and XMVectorPermute.

    XMVECTOR v3 = XMVectorPermute<XM_PERMUTE_0W, XM_PERMUTE_1X, XM_PERMUTE_0X, XM_PERMUTE_1Z>( v1, v2 );


Extracting Components from Vectors

SIMD processing is most efficient when data is loaded into the SIMD registers and fully processed before extracting the results. Conversion between scalar and vector forms is inefficient, so we recommend that you do it only when required. For this reason, functions in the DirectXMath library that produce a scalar value (for example, XMVector2Dot, XMVector3Length, and so on) return it in vector form, with the scalar result replicated across the resulting vector. However, when you need scalar values, here are a few choices on how to go about it:

  • If a single scalar answer is computed, use of the Vector Accessor Functions is appropriate:

    float f = XMVectorGetX( v );
  • If multiple components of the vector are required to be extracted, consider storing the vector in a memory structure and reading it back. For example:

    XMFLOAT4A t;
    XMStoreFloat4A( &t, v );
    // t.x, t.y, t.z, and t.w can be individually accessed now
  • The most efficient form of vector processing is to use memory-to-memory streaming where the input data is loaded from memory (using Vector Load Functions), processed fully in SIMD form, and then written to memory (using Vector Store Functions).
