Graphics (C++ AMP)

Article
02/21/2013

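C++ AMP contains several APIs in the Concurrency::graphics namespace that you can use to access the texture support on GPUs. Some common scenarios are:

You can use the texture class as a data container for computation and exploit the spatial locality of the texture cache and layouts of GPU hardware. Spatial locality is the property of data elements being physically close to each other.
The runtime provides efficient interoperability with non-compute shaders. Pixel, vertex, tessellation, and hull shaders frequently consume or produce textures that you can use in your C++ AMP computations.
The graphics APIs in C++ AMP provide alternative ways to access sub-word packed buffers. Textures whose formats represent texels (texture elements) composed of 8-bit or 16-bit scalars allow access to such packed data storage.

Note

The C++ AMP APIs do not provide texture sampling and filtering functionality. For those, you must use the interoperability features of C++ AMP and write the code in DirectCompute and HLSL.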

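The norm and unorm Types

The norm and unorm types are scalar types that limit the range of float values; this is known as clamping. These types can be explicitly constructed from other scalar types. In a cast, the value is first converted to float and then clamped to the range that the type allows: [-1.0, 1.0] for norm or [0.0, 1.0] for unorm. Casting from +/-infinity returns +/-1. Casting from NaN is undefined. A norm can be implicitly constructed from a unorm with no loss of data. An implicit conversion operator to float is defined on these types. Binary operators are defined between these types and other built-in scalar types such as float and int: +, -, *, /, ==, !=, >, <, >=, <=. The compound assignment operators are also supported: +=, -=, *=, /=. The unary negation operator (-) is defined for the norm type.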
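The clamping rule can be sketched in standard C++ without the C++ AMP headers. The names `clamp_to`, `UnormSketch`, and `NormSketch` below are hypothetical stand-ins invented for illustration; they are not the Concurrency::graphics types, and they model only the construction, conversion, and negation behavior described above.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper: clamp f to [lo, hi]. NaN is rejected in debug builds
// because the text leaves that conversion undefined.
inline float clamp_to(float f, float lo, float hi) {
    assert(!std::isnan(f));
    return std::min(hi, std::max(lo, f));
}

// Stand-in for unorm: values are clamped to [0.0, 1.0].
struct UnormSketch {
    float value;
    explicit UnormSketch(float f) : value(clamp_to(f, 0.0f, 1.0f)) {}
    operator float() const { return value; }  // implicit conversion to float
};

// Stand-in for norm: values are clamped to [-1.0, 1.0].
struct NormSketch {
    float value;
    explicit NormSketch(float f) : value(clamp_to(f, -1.0f, 1.0f)) {}
    NormSketch(const UnormSketch& u) : value(u.value) {}  // implicit, no data loss
    operator float() const { return value; }  // implicit conversion to float
    NormSketch operator-() const { return NormSketch(-value); }
};
```

In this sketch, constructing `NormSketch(2.5f)` stores 1.0, and `UnormSketch(-0.25f)` stores 0.0. Only the unorm-to-norm direction is implicit, because every value in [0.0, 1.0] is also a valid norm value.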

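Short Vector Library

The Short Vector Library provides some of the functionality of the vector type that is defined in HLSL and is typically used to define texels. A short vector is a data structure that holds one to four values of the same type. The supported types are double, float, int, norm, uint, and unorm. The type names are shown in the following table. For each type, there is also a corresponding typedef that doesn't have an underscore in the name. The types that have underscores are in the Concurrency::graphics namespace. The types that don't have underscores are in the Concurrency::graphics::direct3d namespace so that they are clearly separated from similarly named fundamental types such as __int8 and __int16.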

Type	Length 2	Length 3	Length 4
double	double_2, double2	double_3, double3	double_4, double4
float	float_2, float2	float_3, float3	float_4, float4
int	int_2, int2	int_3, int3	int_4, int4
norm	norm_2, norm2	norm_3, norm3	norm_4, norm4
uint	uint_2, uint2	uint_3, uint3	uint_4, uint4
unorm	unorm_2, unorm2	unorm_3, unorm3	unorm_4, unorm4

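Operators

If an operator is defined between two short vectors, then it is also defined between a short vector and a scalar. In addition, one of these must be true:

The scalar's type must be the same as the short vector's element type.
The scalar's type can be implicitly converted to the vector's element type by using only one user-defined conversion.

The operation is carried out component-wise between each component of the short vector and the scalar. Here are the valid operators: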

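Operator type	Valid types
Binary operators	Valid on all types: +, -, *, /. Valid on integer types: %, ^, |, &, <<, >>. The two vectors must have the same size, and the result is a vector of the same size.
Relational operators	Valid on all types: == and !=
Compound assignment operators	Valid on all types: +=, -=, *=, /=. Valid on integer types: %=, ^=, |=, &=, <<=, >>=
Increment and decrement operators	Valid on all types: ++, --. Both prefix and postfix are valid.
Bitwise NOT operator (~)	Valid on integer types.
Unary - operator	Valid on all types except unorm and uint.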
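To make "component-wise" concrete, here is a minimal sketch in standard C++. `Int2Sketch` is a hypothetical two-component type invented for this example, not Concurrency::graphics::int_2, and only a few of the operators listed above are shown.

```cpp
#include <cassert>

// Hypothetical two-component integer vector, for illustration only.
struct Int2Sketch {
    int x;
    int y;
};

// Vector-vector: both operands have the same size, and so does the result.
inline Int2Sketch operator+(Int2Sketch a, Int2Sketch b) {
    return {a.x + b.x, a.y + b.y};
}

// Vector-scalar: the scalar is applied to every component.
inline Int2Sketch operator+(Int2Sketch a, int s) {
    return {a.x + s, a.y + s};
}

// Integer-only operator: shift every component by the same amount.
inline Int2Sketch operator<<(Int2Sketch a, int s) {
    assert(s >= 0 && s < 31);  // keep the shift well defined for int
    return {a.x << s, a.y << s};
}

// Compound assignment with a scalar, again applied per component.
inline Int2Sketch& operator*=(Int2Sketch& a, int s) {
    a.x *= s;
    a.y *= s;
    return a;
}
```

With `Int2Sketch a{1, 2}`, the expression `a + 5` yields the components 6 and 7, and `a << 3` yields 8 and 16.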

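Swizzling Expressions

The Short Vector Library supports the vector_type.identifier accessor construct to access the components of a short vector. The identifier, which is known as a swizzling expression, specifies the components of the vector. The expression can be an l-value or an r-value. The individual characters in the identifier may be x, y, z, and w, or r, g, b, and a. "x" and "r" denote the zeroth component, "y" and "g" denote the first component, and so on. (Notice that "x" and "r" cannot be used in the same identifier.) Therefore, "rgba" and "xyzw" return the same result. Single-component accessors such as "x" and "y" are scalar value types. Multi-component accessors are short vector types. For example, if you construct an int_4 vector that is named fourInts and has the values 2, 4, 6, and 8, then fourInts.y returns the integer 4 and fourInts.rg returns an int_2 object that has the values 2 and 4.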
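The mapping from identifier characters to component indices can be sketched as a small lookup in standard C++. The `swizzle` function below is a hypothetical helper, not part of C++ AMP. Unlike the real accessors, it always returns a `std::vector<int>`, even for a single component, and it reports a mixed identifier such as "xr" at run time rather than at compile time.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: resolve a swizzle identifier such as "y" or "rg"
// against the four components of a vector.
inline std::vector<int> swizzle(const int (&v)[4], const std::string& id) {
    assert(!id.empty() && id.size() <= 4);
    const std::string xyzw = "xyzw";
    const std::string rgba = "rgba";
    // The first character decides which naming set the whole identifier uses,
    // so the two sets cannot be mixed.
    const std::string& names =
        (xyzw.find(id[0]) != std::string::npos) ? xyzw : rgba;
    std::vector<int> out;
    for (char c : id) {
        const std::size_t pos = names.find(c);
        if (pos == std::string::npos) {
            throw std::invalid_argument("invalid or mixed swizzle identifier");
        }
        out.push_back(v[pos]);  // 'x'/'r' -> 0, 'y'/'g' -> 1, and so on
    }
    return out;
}
```

For the fourInts values 2, 4, 6, and 8, `swizzle(fourInts, "y")` holds the single value 4 and `swizzle(fourInts, "rg")` holds 2 and 4, matching the example in the text.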

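Texture Classes

Many GPUs have hardware and caches that are optimized to fetch pixels and texels and to render images and textures. The texture<T,N> class, which is a container class for texel objects, exposes the texture functionality of these GPUs. A texel can be:

An int, uint, float, double, norm, or unorm scalar.
A short vector that has two or four components. The only exception is double_4, which is not allowed.

A texture object can have a rank of 1, 2, or 3. A texture object can be captured only by reference in the lambda of a call to parallel_for_each. The texture is stored on the GPU as a Direct3D texture object. For more information about textures and texels in Direct3D, see Introduction to Textures in Direct3D 11.

The texel type you use might be one of the many texture formats that are used in graphics programming. For example, an RGBA format could use 32 bits, with 8 bits each for the R, G, B, and A scalar elements. The texture hardware of a graphics card can access the individual elements based on the format. For example, if you are using the RGBA format, the texture hardware can extract each 8-bit element into a 32-bit form. In C++ AMP, you can set the bits per scalar element of your texel so that you can automatically access the individual scalar elements in the code without using bit-shifting.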
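For comparison, this is roughly the manual bit-shifting that the texture hardware saves you from. The sketch is plain C++ with hypothetical names (`Uint4Sketch`, `extract_channel`, `unpack_rgba8`), and it assumes that R occupies the least-significant byte of the packed 32-bit value; the text does not state the byte order.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical four-component result, one 32-bit value per channel.
struct Uint4Sketch {
    std::uint32_t r, g, b, a;
};

// Extract one 8-bit channel (0 = R ... 3 = A) and widen it to 32 bits.
// Assumption: channel 0 is stored in the lowest byte.
inline std::uint32_t extract_channel(std::uint32_t packed, unsigned channel) {
    assert(channel < 4);
    return (packed >> (8u * channel)) & 0xFFu;
}

// Unpack all four channels of a packed RGBA texel.
inline Uint4Sketch unpack_rgba8(std::uint32_t packed) {
    return {extract_channel(packed, 0), extract_channel(packed, 1),
            extract_channel(packed, 2), extract_channel(packed, 3)};
}
```

With a texture<uint_4, 2> created at 8 bits per scalar element, a read returns the already widened components, so none of this shifting appears in kernel code.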

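Instantiating Texture Objects

You can declare a texture object without initializing it. The following code example declares several texture objects.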

#include <amp.h>
#include <amp_graphics.h>
using namespace concurrency;
using namespace concurrency::graphics;

void declareTextures() {

    // Create a 16-texel texture of int. 
    texture<int, 1> intTexture1(16);  
    texture<int, 1> intTexture2(extent<1>(16)); 

    // Create a 16 x 32 texture of float_2.  
    texture<float_2, 2> floatTexture1(16, 32);  
    texture<float_2, 2> floatTexture2(extent<2>(16, 32));   

    // Create a 2 x 4 x 8 texture of uint_4. 
    texture<uint_4, 3> uintTexture1(2, 4, 8);  
    texture<uint_4, 3> uintTexture2(extent<3>(2, 4, 8));
}

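You can also use a constructor to declare and initialize a texture object. The following code example instantiates a texture object from a vector of int_4 objects. The bits per scalar element is set to the default value. You cannot use this constructor with norm, unorm, or the short vectors of norm and unorm, because they do not have a default bits per scalar element.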

#include <amp.h>
#include <amp_graphics.h>
#include <vector>
using namespace concurrency;
using namespace concurrency::graphics;

void initializeTexture() {

    std::vector<int_4> texels;
    for (int i = 0; i < 768 * 1024; i++) {
        int_4 i4(i, i, i, i);
        texels.push_back(i4);
    }
    
    texture<int_4, 2> aTexture(768, 1024, texels.begin(), texels.end());
}

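You can also declare and initialize a texture object by using a constructor overload that takes a pointer to the source data, the size of the source data in bytes, and the bits per scalar element.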

void createTextureWithBPC() {
    // Create the source data.
    float source[1024 * 2]; 
    for (int i = 0; i < 1024 * 2; i++) {
        source[i] = (float)i;
    }

    // Initialize the texture by using the size of source in bytes
    // and bits per scalar element.
    texture<float_2, 1> floatTexture(1024, source, (unsigned int)sizeof(source), 32U); 
}

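The textures in these examples are created on the default view of the default accelerator. You can use other overloads of the constructor if you want to specify an accelerator_view object. You cannot create a texture object on a CPU accelerator.

There are limits on the size of each dimension of the texture object, as shown in the following table. A run-time error is generated if you exceed the limits.

Texture	Size limit per dimension
texture<T,1>	16384
texture<T,2>	16384
texture<T,3>	2048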
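Because exceeding a limit surfaces only as a run-time error, it can be convenient to validate an extent before constructing the texture. The following is a minimal sketch in standard C++, using limits of 16384 per dimension for ranks 1 and 2 and 2048 per dimension for rank 3. Both function names are hypothetical and are not part of the C++ AMP API.

```cpp
#include <cassert>
#include <initializer_list>

// Per-dimension size limit for a texture of the given rank (1, 2, or 3).
inline unsigned max_texture_dimension(int rank) {
    assert(rank >= 1 && rank <= 3);
    return rank == 3 ? 2048u : 16384u;
}

// Hypothetical pre-check: true when every dimension is within the limit
// for a texture whose rank equals the number of dimensions passed in.
inline bool extent_within_limits(std::initializer_list<unsigned> dims) {
    const unsigned limit =
        max_texture_dimension(static_cast<int>(dims.size()));
    for (unsigned d : dims) {
        if (d > limit) {
            return false;
        }
    }
    return true;
}
```

For example, `extent_within_limits({16384, 16384})` passes, while `extent_within_limits({2, 4, 4096})` fails because the third dimension of a rank-3 texture exceeds 2048.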

Reading from Texture Objects

You can read from a texture object by using texture::operator[], texture::operator(), or the texture::get method. texture::operator[] and texture::operator() return a value, not a reference. Therefore, you cannot write to a texture object by using texture::operator[].

void readTexture() {
    std::vector<int_2> src;    
    for (int i = 0; i < 16 *32; i++) {
        int_2 i2(i, i);
        src.push_back(i2);
    }

    std::vector<int_2> dst(16 * 32);  
    array_view<int_2, 2> arr(16, 32, dst);  
    arr.discard_data(); 

    const texture<int_2, 2> tex9(16, 32, src.begin(), src.end());  
    parallel_for_each(tex9.extent, [=, &tex9] (index<2> idx) restrict(amp) {          
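        // Use the subscript operator. Assign rather than accumulate here,
        // because discard_data() leaves the contents of arr undefined.
        arr[idx].x = tex9[idx].x; 
        // Use the function () operator.      
        arr[idx].x += tex9(idx).x; 
        // Use the get method. Assign first, for the same reason as above.
        arr[idx].y = tex9.get(idx).y; 
        // Use the function () operator with separate coordinates.  
        arr[idx].y += tex9(idx[0], idx[1]).y; 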
    });  

    arr.synchronize();
}

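The following code example demonstrates how to store texture channels in a short vector, and then access the individual scalar elements as properties of the short vector.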

void UseBitsPerScalarElement() {
    // Create the image data. 
    // Each unsigned int (32-bit) represents four 8-bit scalar elements(r,g,b,a values).
    const int image_height = 16;
    const int image_width = 16;
    std::vector<unsigned int> image(image_height * image_width);

    extent<2> image_extent(image_height, image_width);

    // By using uint_4 and 8 bits per channel, each 8-bit channel in the data source is 
    // stored in one 32-bit component of a uint_4.
    texture<uint_4, 2> image_texture(image_extent, image.data(), image_extent.size() * 4U,  8U);

    // You can access the RGBA values of the source data by using swizzling expressions of the uint_4.
    parallel_for_each(image_extent,  
         [&image_texture](index<2> idx) restrict(amp) 
    { 
        // 4 bytes are automatically extracted when reading.
        uint_4 color = image_texture[idx]; 
        unsigned int r = color.r; 
        unsigned int g = color.g; 
        unsigned int b = color.b; 
        unsigned int a = color.a; 
    });
}

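The following table lists the valid bits per scalar element for each scalar and short vector type.

Texture data type	Valid bits per scalar element
int, int_2, int_4, uint, uint_2, uint_4	8, 16, 32
float, float_2, float_4	16, 32
double, double_2	64
norm, norm_2, norm_4, unorm, unorm_2, unorm_4	8, 16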

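Writing to Texture Objects

Use the texture::set method to write to texture objects. A texture object can be read-only or read/write. For a texture object to be readable and writable, the following conditions must be true:

T has only one scalar component. (Short vectors are not allowed.)
T is not double, norm, or unorm.
The texture::bits_per_scalar_element property is 32.

Unless all three conditions are true, the texture object is read-only. The first two conditions are checked during compilation: a compilation error is generated if your code tries to write to a read-only texture object. The condition on texture::bits_per_scalar_element is checked at run time, and the runtime generates the unsupported_feature exception if you try to write to a read-only texture object.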
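The three conditions can be summarized as a small predicate. This is a sketch in standard C++ with hypothetical names (`ScalarKind`, `TexelTypeSketch`, `type_allows_write`, `is_writable`). It models the rule only and does not query a real texture object.

```cpp
#include <cassert>

// Hypothetical description of a texel type: scalar kind plus component count.
enum class ScalarKind { Int, Uint, Float, Double, Norm, Unorm };

struct TexelTypeSketch {
    ScalarKind kind;
    int components;  // 1 for a scalar, 2 to 4 for a short vector
};

// The two conditions that depend only on the type T.
constexpr bool type_allows_write(TexelTypeSketch t) {
    return t.components == 1 && t.kind != ScalarKind::Double &&
           t.kind != ScalarKind::Norm && t.kind != ScalarKind::Unorm;
}

// All three conditions, including the one that is only known at run time.
inline bool is_writable(TexelTypeSketch t, unsigned bits_per_scalar_element) {
    assert(t.components >= 1 && t.components <= 4);
    return type_allows_write(t) && bits_per_scalar_element == 32u;
}
```

Under this rule, a texture<int, 1> with 32 bits per scalar element is writable, but the same type created with 8 bits per scalar element is not, and neither is any texture of int_2.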

The following code example writes values to a texture object.

void writeTexture() {
    texture<int, 1> tex1(16); 
    parallel_for_each(tex1.extent, [&tex1] (index<1> idx) restrict(amp) {    
        tex1.set(idx, 0); 
    });

}

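Using a writeonly_texture_view Object

The writeonly_texture_view class provides a write-only view of a texture object. The writeonly_texture_view object must be captured by value in the lambda expression. The following code example uses a writeonly_texture_view object to write to a texture object that has two components (int_2).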

void write2ComponentTexture() {
    texture<int_2, 1> tex4(16); 
    writeonly_texture_view<int_2, 1> wo_tv4(tex4); 
    parallel_for_each(extent<1>(16), [=] (index<1> idx) restrict(amp) {   
        wo_tv4.set(idx, int_2(1, 1)); 
    });
}

Copying Texture Objects

You can copy between texture objects by using the copy function or the copy_async function, as shown in the following code example.

void copyHostArrayToTexture() {
    // Copy from source array to texture object by using the copy function.
    float floatSource[1024 * 2]; 
    for (int i = 0; i < 1024 * 2; i++) {
        floatSource[i] = (float)i;
    }
    texture<float_2, 1> floatTexture(1024);
    copy(floatSource, (unsigned int)sizeof(floatSource), floatTexture); 

    // Copy from source array to texture object by using the copy function.
    char charSource[16 * 16]; 
    for (int i = 0; i < 16 * 16; i++) {
        charSource[i] = (char)i;
    }
    texture<int, 2> charTexture(16, 16, 8U);
    copy(charSource, (unsigned int)sizeof(charSource), charTexture); 
    // Copy from texture object to source array by using the copy function.
    copy(charTexture, charSource, (unsigned int)sizeof(charSource)); 
}

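You can also copy from one texture to another by using the texture::copy_to method. The two textures can be on different accelerator_views. When you copy to a writeonly_texture_view object, the data is copied to the underlying texture object. The bits per scalar element and the extent must be the same on the source and destination texture objects. If those requirements are not met, the runtime throws an exception.

Interoperability

The C++ AMP runtime supports interoperability between texture<T,1> and the ID3D11Texture1D interface, between texture<T,2> and the ID3D11Texture2D interface, and between texture<T,3> and the ID3D11Texture3D interface. The get_texture method takes a texture object and returns an IUnknown interface. The make_texture method takes an IUnknown interface and an accelerator_view object and returns a texture object.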