ops Package

CNTK core operators. Calling these operators creates nodes in the CNTK computational graph.

Packages

sequence

CNTK operators that are specialized in sequences. Calling these operators creates nodes in the CNTK computational graph.

tests

Modules

functions

CNTK function constructs. This is the core abstraction of all primitive operators in the CNTK computational graph.

Functions

abs

Computes the element-wise absolute of x:

abs(x, name='')

Parameters

x

numpy array or any Function that outputs a tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> C.abs([-1, 1, -2, 3]).eval()
array([ 1.,  1.,  2.,  3.], dtype=float32)

acos

Computes the element-wise arccos (inverse cosine) of x:

The output tensor has the same shape as x.

acos(x, name='')

Parameters

x

numpy array or any Function that outputs a tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> np.round(C.acos([[1,0.5],[-0.25,-0.75]]).eval(),5)
array([[ 0.     ,  1.0472 ],
       [ 1.82348,  2.41886]], dtype=float32)

alias

Create a new Function instance that aliases the specified 'x' Function/Variable, such that the 'Output' of the new 'Function' is the same as the 'Output' of the specified 'x' Function/Variable, and that has the newly specified name. The purpose of this operator is to create a new distinct reference to a symbolic computation, different from the original Function/Variable that it aliases. This can be used, for example, to substitute a specific instance of the aliased Function/Variable in the computation graph instead of substituting all usages of the aliased Function/Variable.

alias(x, name='')

Parameters

operand

The Function/Variable to alias

name
<xref:str>, <xref:optional>

the name of the Alias Function in the network

Returns

cntk.ops.functions.Function
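
Examples

A minimal illustrative sketch; it only shows that the alias carries the newly specified name:

>>> x = C.input_variable((3,))
>>> y = C.alias(x, name='x_alias')
>>> y.name  # doctest: +SKIP
'x_alias'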

argmax

Computes the argmax of the input tensor's elements across the specified axis. If no axis is specified, it will return the flattened index of the largest element in tensor x.

argmax(x, axis=None, name='')

Parameters

x
<xref:numpy.array> or Function

any Function that outputs a tensor.

axis
<xref:int> or Axis

axis along which the reduction will be performed

name
<xref:str>, <xref:default to ''>

the name of the Function instance in the network

Returns

An instance of Function

Return type

cntk.ops.functions.Function

Examples


>>> # create 3x2 matrix in a sequence of length 1 in a batch of one sample
>>> data = [[10, 20],[30, 40],[50, 60]]

>>> C.argmax(data, 0).eval()
array([[ 2.,  2.]], dtype=float32)

>>> C.argmax(data, 1).eval()
array([[ 1.],
       [ 1.],
       [ 1.]], dtype=float32)

argmin

Computes the argmin of the input tensor's elements across the specified axis. If no axis is specified, it will return the flattened index of the smallest element in tensor x.

argmin(x, axis=None, name='')

Parameters

x
<xref:numpy.array> or Function

any Function that outputs a tensor.

axis
<xref:int> or Axis

axis along which the reduction will be performed

name
<xref:str>, <xref:default to ''>

the name of the Function instance in the network

Returns

An instance of Function

Return type

cntk.ops.functions.Function

Examples


>>> # create 3x2 matrix in a sequence of length 1 in a batch of one sample
>>> data = [[10, 30],[40, 20],[60, 50]]

>>> C.argmin(data, 0).eval()
array([[ 0.,  1.]], dtype=float32)

>>> C.argmin(data, 1).eval()
array([[ 0.],
       [ 1.],
       [ 1.]], dtype=float32)

as_block

Create a new block Function instance which just encapsulates the specified composite Function to create a new Function that appears to be a primitive. All the arguments of the composite being encapsulated must be Placeholder variables. The purpose of block Functions is to enable creation of hierarchical Function graphs where details of implementing certain building block operations can be encapsulated away such that the actual structure of the block's implementation is not inlined into the parent graph where the block is used, and instead the block just appears as an opaque primitive. Users still have the ability to peek at the underlying Function graph that implements the actual block Function.

as_block(composite, block_arguments_map, block_op_name, block_instance_name='')

Parameters

composite

The composite Function that the block encapsulates

block_arguments_map

A list of tuples, mapping from block's underlying composite's arguments to actual variables they are connected to

block_op_name

Name of the op that the block represents

block_instance_name
<xref:str>, <xref:optional>

the name of the block Function in the network

Returns

cntk.ops.functions.Function
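
Examples

A minimal construction sketch; the block op name 'MySigmoid' and instance name 'my_block_instance' are arbitrary. The composite is built over a placeholder, which is then mapped to an actual input variable:

>>> p = C.placeholder(shape=(3,))
>>> composite = C.sigmoid(p)
>>> x = C.input_variable((3,))
>>> blk = C.as_block(composite, [(p, x)], 'MySigmoid', 'my_block_instance')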

as_composite

Creates a composite Function that has the specified root_function as its root. The composite denotes a higher-level Function encapsulating the entire graph of Functions underlying the specified rootFunction.

as_composite(root_function, name='')

Parameters

root_function

Root Function, the graph underlying which, the newly created composite encapsulates

name
<xref:str>, <xref:optional>

the name of the Alias Function in the network

Returns

cntk.ops.functions.Function
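
Examples

A minimal sketch that wraps a small graph into a composite; the name 'wrapped_sigmoid' is arbitrary:

>>> x = C.input_variable((2,))
>>> f = C.sigmoid(x)
>>> g = C.as_composite(f.root_function, name='wrapped_sigmoid')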

asin

Computes the element-wise arcsin (inverse sine) of x:

The output tensor has the same shape as x.

asin(x, name='')

Parameters

x

numpy array or any Function that outputs a tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> np.round(C.asin([[1,0.5],[-0.25,-0.75]]).eval(),5)
array([[ 1.5708 ,  0.5236 ],
       [-0.25268, -0.84806]], dtype=float32)

asinh

Computes the element-wise asinh of x:

The output tensor has the same shape as x.

asinh(x, name='')

Parameters

x

numpy array or any Function that outputs a tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> np.round(C.asinh([[1,0.5],[-0.25,-0.75]]).eval(),5)
array([[ 0.88137,  0.48121],
       [-0.24747, -0.69315]], dtype=float32)

assign

Assigns the value of input to ref and returns the new value. ref needs to have the same layout as input. Neither ref nor input can have dynamic axes, and broadcasting is not supported for the assign operator. ref only receives the new value after the forward (or forward and backward) pass finishes, so any part of the graph that depends on ref will get the old value. To get the new value, use the one returned by the assign node. The reason for this is to give assign a deterministic behavior.

If not computing gradients, the ref will be assigned the new value after the forward pass over the entire Function graph is complete; i.e. all uses of ref in the forward pass will use the original (pre-assignment) value of ref.

If computing gradients (training mode), the assignment to ref will happen after completing both the forward and backward passes over the entire Function graph.

The ref must be a Parameter or Constant. If the same ref is used in multiple assign operations, then the order in which the assignment happens is non-deterministic and the final value can be either of the assignments unless an order is established using a data dependence between the assignments.

assign(ref, input, name='')

Parameters

ref

a cntk.variables.Constant or cntk.variables.Parameter.

input

a cntk.ops.functions.Function that outputs a tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> dest = C.constant(shape=(3,4))
>>> data = C.parameter(shape=(3,4), init=2)
>>> C.assign(dest,data).eval()
array([[ 2.,  2.,  2.,  2.],
       [ 2.,  2.,  2.,  2.],
       [ 2.,  2.,  2.,  2.]], dtype=float32)
>>> dest.asarray()
array([[ 2.,  2.,  2.,  2.],
       [ 2.,  2.,  2.,  2.],
       [ 2.,  2.,  2.,  2.]], dtype=float32)

>>> dest = C.parameter(shape=(3,4), init=0)
>>> a = C.assign(dest, data)
>>> y = dest + data
>>> result = C.combine([y, a]).eval()
>>> result[y.output]
array([[ 2.,  2.,  2.,  2.],
       [ 2.,  2.,  2.,  2.],
       [ 2.,  2.,  2.,  2.]], dtype=float32)
>>> dest.asarray()
array([[ 2.,  2.,  2.,  2.],
       [ 2.,  2.,  2.,  2.],
       [ 2.,  2.,  2.,  2.]], dtype=float32)
>>> result = C.combine([y, a]).eval()
>>> result[y.output]
array([[ 4.,  4.,  4.,  4.],
       [ 4.,  4.,  4.,  4.],
       [ 4.,  4.,  4.,  4.]], dtype=float32)
>>> dest.asarray()
array([[ 2.,  2.,  2.,  2.],
       [ 2.,  2.,  2.,  2.],
       [ 2.,  2.,  2.,  2.]], dtype=float32)

associative_multi_arg

The output of this operation is the result of an operation (plus, log_add_exp, element_times, element_max, element_min) applied to two or more input tensors. Broadcasting is supported.

associative_multi_arg(f)

Parameters

left

left side tensor

right

right side tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> C.plus([1, 2, 3], [4, 5, 6]).eval()
array([ 5.,  7.,  9.], dtype=float32)

>>> C.element_times([5., 10., 15., 30.], [2.]).eval()
array([ 10.,  20.,  30.,  60.], dtype=float32)

>>> C.plus([-5, -4, -3, -2, -1], [10], [3, 2, 3, 2, 3], [-13], [+42], 'multi_arg_example').eval()
array([ 37.,  37.,  39.,  39.,  41.], dtype=float32)

>>> C.element_times([5., 10., 15., 30.], [2.], [1., 2., 1., 2.]).eval()
array([  10.,   40.,   30.,  120.], dtype=float32)

>>> a = np.arange(3,dtype=np.float32)
>>> np.exp(C.log_add_exp(np.log(1+a), np.log(1+a*a)).eval())
array([ 2.,  4.,  8.], dtype=float32)

atan

Computes the element-wise arctan (inverse tangent) of x:

The output tensor has the same shape as x.

atan(x, name='')

Parameters

x

numpy array or any Function that outputs a tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> np.round(C.atan([-1, 0, 1]).eval(), 5)
array([-0.78539997,  0.        ,  0.78539997], dtype=float32)

atanh

Computes the element-wise atanh of x:

The output tensor has the same shape as x.

atanh(x, name='')

Parameters

x

numpy array or any Function that outputs a tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> np.round(C.atanh([[0.9,0.5],[-0.25,-0.75]]).eval(),5)
array([[ 1.47222,  0.54931],
       [-0.25541, -0.97296]], dtype=float32)

batch_normalization

Normalizes layer outputs for every minibatch for each output (feature) independently and applies affine transformation to preserve representation of the layer.

batch_normalization(operand, scale, bias, running_mean, running_inv_std, spatial, normalization_time_constant=5000, blend_time_constant=0, epsilon=1e-05, use_cudnn_engine=False, disable_regularization=False, name='', running_count=None)

Parameters

operand

input of the batch normalization operation

scale

parameter tensor that holds the learned componentwise-scaling factors

bias

parameter tensor that holds the learned bias. scale and bias must have the same dimensions which must be equal to the input dimensions in case of spatial = False or number of output convolution feature maps in case of spatial = True

running_mean

running mean which is used during evaluation phase and might be used during training as well. You must pass a constant tensor with initial value 0 and the same dimensions as scale and bias

running_inv_std

running variance. Represented in the same way as running_mean.

running_count

Denotes the total number of samples that have been used so far to compute the running_mean and running_inv_std parameters. You must pass a scalar (a rank-0 constant(val)).

spatial
<xref:bool>

flag that indicates whether to compute mean/var for each feature in a minibatch independently or, in case of convolutional layers, per feature map

normalization_time_constant
<xref:float>, <xref:default 5000>

time constant for computing running average of mean and variance as a low-pass filtered version of the batch statistics.

blend_time_constant
<xref:float>, <xref:default 0>

constant for smoothing batch estimates with the running statistics

epsilon

conditioner constant added to the variance when computing the inverse standard deviation

use_cudnn_engine
<xref:bool>, <xref:default False>
name
<xref:str>, <xref:optional>

the name of the Function instance in the network

disable_regularization
<xref:bool>, <xref:default False>

turn off regularization in batch normalization

Returns

cntk.ops.functions.Function
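
Examples

A construction-only sketch; shapes and initial values are illustrative and follow the parameter descriptions above:

>>> x = C.input_variable((3,))
>>> scale = C.parameter((3,), init=1.0)
>>> bias = C.parameter((3,), init=0.0)
>>> running_mean = C.constant(0., shape=(3,))
>>> running_inv_std = C.constant(0., shape=(3,))
>>> running_count = C.constant(0.)
>>> bn = C.batch_normalization(x, scale, bias, running_mean, running_inv_std,
...                            spatial=False, running_count=running_count)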

cast

Casts the input to dtype, with the same shape and dynamic axes. This function currently supports only the forward pass.

cast(node_input, dtype, name='')

Parameters

node_input

a cntk.input_variable that needs the dtype conversion

dtype

data_type (np.float32, np.float64, np.float16): data type of the converted output

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function
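
Examples

A brief illustrative sketch; the values are unchanged and only the output dtype differs:

>>> x = C.input_variable((3,), dtype=np.float32)
>>> y = C.cast(x, np.float64)
>>> y.eval({x: np.asarray([[1., 2., 3.]], dtype=np.float32)}) # doctest: +SKIP
array([[ 1.,  2.,  3.]])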

ceil

The output of this operation is the element-wise value rounded to the smallest integer greater than or equal to the input.

ceil(arg, name='')

Parameters

arg

input tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network (optional)

Returns

cntk.ops.functions.Function

Examples


>>> C.ceil([0.2, 1.3, 4., 5.5, 0.0]).eval()
array([ 1.,  2.,  4.,  6.,  0.], dtype=float32)

>>> C.ceil([[0.6, 3.3], [1.9, 5.6]]).eval()
array([[ 1.,  4.],
       [ 2.,  6.]], dtype=float32)

clip

Computes a tensor with all of its values clipped to fall between min_value and max_value, i.e. min(max(x, min_value), max_value).

The output tensor has the same shape as x.

clip(x, min_value, max_value, name='')

Parameters

x

tensor to be clipped

min_value
<xref:float>

a scalar or a tensor which represents the minimum value to clip element values to

max_value
<xref:float>

a scalar or a tensor which represents the maximum value to clip element values to

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> C.clip([1., 2.1, 3.0, 4.1], 2., 4.).eval()
array([ 2. ,  2.1,  3. ,  4. ], dtype=float32)

>>> C.clip([-10., -5., 0., 5., 10.], [-5., -4., 0., 3., 5.], [5., 4., 1., 4., 9.]).eval()
array([-5., -4.,  0.,  4.,  9.], dtype=float32)

combine

Create a new Function instance which just combines the outputs of the specified list of 'operands' Functions such that the 'Outputs' of the new 'Function' are the union of the 'Outputs' of each of the specified 'operands' Functions. E.g., when creating a classification model, typically the CrossEntropy loss Function and the ClassificationError Function comprise the two roots of the computation graph, which can be combined to create a single Function with 2 outputs; viz. CrossEntropy loss and ClassificationError output.

combine(*operands, **kw_name)

Parameters

operands
<xref:list>

list of functions or their variables to combine

name
<xref:str>, <xref:optional>

the name of the Combine Function in the network

Returns

cntk.ops.functions.Function

Examples


>>> in1 = C.input_variable((4,))
>>> in2 = C.input_variable((4,))

>>> in1_data = np.asarray([[1., 2., 3., 4.]], np.float32)
>>> in2_data = np.asarray([[0., 5., -3., 2.]], np.float32)

>>> plus_operation = in1 + in2
>>> minus_operation = in1 - in2

>>> forward = C.combine([plus_operation, minus_operation]).eval({in1: in1_data, in2: in2_data})
>>> len(forward)
2
>>> list(forward.values()) # doctest: +SKIP
[array([[[ 1., -3.,  6.,  2.]]], dtype=float32),
 array([[[ 1.,  7.,  0.,  6.]]], dtype=float32)]
>>> x = C.input_variable((4,))
>>> _ = C.combine(x, x)
>>> _ = C.combine([x, x])
>>> _ = C.combine((x, x))
>>> _ = C.combine(C.combine(x, x), x)

constant

It creates a constant tensor initialized from a numpy array.

constant(value=None, shape=None, dtype=None, device=None, name='')

Parameters

value
<xref:scalar> or <xref:NumPy array>, <xref:optional>

a scalar initial value that would be replicated for every element in the tensor or NumPy array. If None, the tensor will be initialized uniformly random.

shape
<xref:tuple> or <xref:int>, <xref:optional>

the shape of the input tensor. If not provided, it will be inferred from value.

dtype
<xref:optional>

data type of the constant. If a NumPy array and dtype are given, then the data will be converted if needed. If none is given, it will default to np.float32.

device
DeviceDescriptor

instance of DeviceDescriptor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.variables.Constant

Examples


>>> constant_data = C.constant([[1., 2.], [3., 4.], [5., 6.]])
>>> constant_data.value
array([[ 1.,  2.],
       [ 3.,  4.],
       [ 5.,  6.]], dtype=float32)

convolution

Computes the convolution of convolution_map (typically a tensor of learnable parameters) with operand (commonly an image or output of a previous convolution/pooling operation). This operation is used in image and language processing applications. It supports arbitrary dimensions, strides, sharing, and padding.

This function operates on input tensors with dimensions [C x M_1 x M_2 x ... x M_n]. This can be understood as a rank-n object, where each entry consists of a C-dimensional vector. For example, an RGB image would have dimensions [3 x W x H], i.e. a [W x H]-sized structure, where each entry (pixel) consists of a 3-tuple.

convolution convolves the input operand with a rank n+2 tensor of (typically learnable) filters called convolution_map of shape [O x I x m_1 x m_2 x ... x m_n] (typically m_i << M_i). The first dimension, O, is the number of convolution filters (i.e. the number of channels in the output). The second dimension, I, must match the number of channels in the input, which can be ignored if reduction_rank is 0. The last n dimensions are the spatial extent of the filter. I.e. for each output position, a vector of dimension O is computed. Hence, the total number of filter parameters is O x I x m_1 x m_2 x ... x m_n.

convolution(convolution_map, operand, strides=(1,), sharing=[True], auto_padding=[True], sequential=False, dilation=(1,), reduction_rank=1, groups=1, max_temp_mem_size_in_samples=0, name='')

Parameters

convolution_map

convolution filter weights, stored as a tensor of dimensions [O x I x m_1 x m_2 x ... x m_n], where [m_1 x m_2 x ... x m_n] must be the kernel dimensions (spatial extent of the filter).

operand

convolution input. A tensor with dimensions [I x M_1 x M_2 x ... x M_n].

strides
<xref:tuple>, <xref:optional>

stride dimensions. If strides[i] > 1 then only pixel positions that are multiples of strides[i] are computed. For example, a stride of 2 will lead to a halving of that dimension. The first stride dimension that lines up with the number of input channels can be set to any non-zero value.

sharing
<xref:bool>

sharing flags for each input dimension

auto_padding
<xref:bool>

flags for each input dimension whether it should be padded automatically (that is, symmetrically) or not padded at all. Padding means that the convolution kernel is applied to all pixel positions, where all pixels outside the area are assumed zero ("padded with zeroes"). Without padding, the kernels are only shifted over positions where all inputs to the kernel still fall inside the area. In this case, the output dimension will be less than the input dimension. The last value that lines up with the number of input channels must be false.

dilation
<xref:tuple>, <xref:optional>

the dilation value along each axis; the default of 1 means no dilation.

reduction_rank
<xref:int>, <xref:default 1>

must be 0 or 1; 0 means there is no depth or channel dimension in the input, and 1 means the input has a channel or depth dimension.

groups
<xref:int>, <xref:default 1>

number of groups during convolution, which controls the connections between input and output channels. The default value is 1, which means that all input channels are convolved to produce all output channels. A value of N means that the input (and output) channels are divided into N groups, with the input channels in one group (say the i-th input group) contributing to output channels in only one group (the i-th output group). The number of input and output channels must be divisible by the value of the groups argument. Also, the value of this argument must be strictly positive, i.e. groups > 0.

sequential
<xref:bool>, <xref:default False>

flag indicating whether to convolve over the sequential axis.

max_temp_mem_size_in_samples
<xref:int>

maximum amount of auxiliary memory (in samples) that should be reserved to perform convolution operations. Some convolution engines (e.g. cuDNN and GEMM-based engines) can benefit from using workspace as it may improve performance. However, sometimes this may lead to higher memory utilization. Default is 0 which means the same as the input samples.

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> img = np.reshape(np.arange(25.0, dtype = np.float32), (1, 5, 5))
>>> x = C.input_variable(img.shape)
>>> filter = np.reshape(np.array([2, -1, -1, 2], dtype = np.float32), (1, 2, 2))
>>> kernel = C.constant(value = filter)
>>> np.round(C.convolution(kernel, x, auto_padding = [False]).eval({x: [img]}),5)
array([[[[  6.,   8.,  10.,  12.],
          [ 16.,  18.,  20.,  22.],
          [ 26.,  28.,  30.,  32.],
          [ 36.,  38.,  40.,  42.]]]], dtype=float32)

convolution_transpose

Computes the transposed convolution of convolution_map (typically a tensor of learnable parameters) with operand (commonly an image or output of a previous convolution/pooling operation). This is also known as fractionally strided convolutional layers, or, deconvolution. This operation is used in image and language processing applications. It supports arbitrary dimensions, strides, sharing, and padding.

This function operates on input tensors with dimensions [C x M_1 x M_2 x ... x M_n]. This can be understood as a rank-n object, where each entry consists of a C-dimensional vector. For example, an RGB image would have dimensions [3 x W x H], i.e. a [W x H]-sized structure, where each entry (pixel) consists of a 3-tuple.

convolution_transpose convolves the input operand with a rank n+2 tensor of (typically learnable) filters called convolution_map of shape [I x O x m_1 x m_2 x ... x m_n] (typically m_i << M_i). The first dimension, I, must match the number of channels in the input. The second dimension, O, is the number of convolution filters (i.e. the number of channels in the output). The last n dimensions are the spatial extent of the filter. I.e. for each output position, a vector of dimension O is computed. Hence, the total number of filter parameters is I x O x m_1 x m_2 x ... x m_n.

convolution_transpose(convolution_map, operand, strides=(1,), sharing=[True], auto_padding=[True], output_shape=None, dilation=(1,), reduction_rank=1, max_temp_mem_size_in_samples=0, name='')

Parameters

convolution_map

convolution filter weights, stored as a tensor of dimensions [I x O x m_1 x m_2 x ... x m_n], where [m_1 x m_2 x ... x m_n] must be the kernel dimensions (spatial extent of the filter).

operand

convolution input. A tensor with dimensions [I x M_1 x M_2 x ... x M_n].

strides
<xref:tuple>, <xref:optional>

stride dimensions. If strides[i] > 1 then only pixel positions that are multiples of strides[i] are computed. For example, a stride of 2 will lead to a halving of that dimension. The first stride dimension that lines up with the number of input channels can be set to any non-zero value.

sharing
<xref:bool>

sharing flags for each input dimension

auto_padding
<xref:bool>

flags for each input dimension whether it should be padded automatically (that is, symmetrically) or not padded at all. Padding means that the convolution kernel is applied to all pixel positions, where all pixels outside the area are assumed zero ("padded with zeroes"). Without padding, the kernels are only shifted over positions where all inputs to the kernel still fall inside the area. In this case, the output dimension will be less than the input dimension. The last value that lines up with the number of input channels must be false.

output_shape

user expected output shape after convolution transpose.

dilation
<xref:tuple>, <xref:optional>

the dilation value along each axis; the default of 1 means no dilation.

reduction_rank
<xref:int>, <xref:default 1>

must be 0 or 1; 0 means there is no depth or channel dimension in the input, and 1 means the input has a channel or depth dimension.

max_temp_mem_size_in_samples
<xref:int>

maximum amount of auxiliary memory (in samples) that should be reserved to perform convolution operations. Some convolution engines (e.g. cuDNN and GEMM-based engines) can benefit from using workspace as it may improve performance. However, sometimes this may lead to higher memory utilization. Default is 0 which means the same as the input samples.

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> img = np.reshape(np.arange(9.0, dtype = np.float32), (1, 3, 3))
>>> x = C.input_variable(img.shape)
>>> filter = np.reshape(np.array([2, -1, -1, 2], dtype = np.float32), (1, 2, 2))
>>> kernel = C.constant(value = filter)
>>> np.round(C.convolution_transpose(kernel, x, auto_padding = [False]).eval({x: [img]}),5)
array([[[[  0.,   2.,   3.,  -2.],
          [  6.,   4.,   6.,  -1.],
          [  9.,  10.,  12.,   2.],
          [ -6.,   5.,   6.,  16.]]]], dtype=float32)

cos

Computes the element-wise cosine of x:

The output tensor has the same shape as x.

cos(x, name='')

Parameters

x

numpy array or any Function that outputs a tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> np.round(C.cos(np.arccos([[1,0.5],[-0.25,-0.75]])).eval(),5)
array([[ 1.  ,  0.5 ],
       [-0.25, -0.75]], dtype=float32)

cosh

Computes the element-wise cosh of x:

The output tensor has the same shape as x.

cosh(x, name='')

Parameters

x

numpy array or any Function that outputs a tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> np.round(C.cosh([[1,0.5],[-0.25,-0.75]]).eval(),5)
array([[ 1.54308,  1.12763],
       [ 1.03141,  1.29468]], dtype=float32)

crop_automatic

Crops the input along spatial dimensions so that it matches the spatial size of the reference input.

Crop offsets are computed by traversing the network graph and computing affine transform between the two inputs. Translation part of the transform determines the offsets. The transform is computed as composition of the transforms between each input and their common ancestor. The common ancestor is expected to exist.

crop_automatic(node_input, node_referent, name='')

Parameters

node_input

a cntk.ops.functions.Function that outputs the tensor to be cropped

node_referent

a cntk.ops.functions.Function that outputs the reference tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

crop_automatic_with_ancestors

Crops the input along spatial dimensions so that it matches the spatial size of the reference input.

Crop offsets are computed by traversing the network graph and computing affine transform between the two inputs. Translation part of the transform determines the offsets. The transform is computed as composition of the transforms between each input and their common ancestor.

ancestor_input and ancestor_referent are expected to be ancestors of node_input and node_referent, respectively. They act like the same node for the purpose of finding a common ancestor. They are used in cases when node_input and node_referent do not have a common ancestor in the network. Typically, the ancestor nodes have the same spatial size. For example, in pixelwise semantic labeling, ancestor_input would be the input image, and ancestor_referent would be the ground truth image containing pixelwise labels.

crop_automatic_with_ancestors(node_input, node_referent, ancestor_input, ancestor_referent, name='')

Parameters

node_input

a cntk.ops.functions.Function that outputs the tensor to be cropped

node_referent

a cntk.ops.functions.Function that outputs the reference tensor

ancestor_input
<xref:optional>

a cntk.ops.functions.Function that outputs an ancestor of node_input

ancestor_referent
<xref:optional>

a cntk.ops.functions.Function that outputs an ancestor of node_referent

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

crop_manual

Crops the input along spatial dimensions so that it matches the spatial size of the reference input. Crop offsets are given in pixels.

crop_manual(node_input, node_referent, offset_x, offset_y, name='')

Parameters

node_input

a cntk.ops.functions.Function that outputs the tensor to be cropped

node_referent

a cntk.ops.functions.Function that outputs the reference tensor

offset_x
<xref:int>

horizontal crop offset

offset_y
<xref:int>

vertical crop offset

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function
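
Examples

An illustrative sketch; the input is cropped to the 3x3 spatial size of the reference, starting at offset (1, 1):

>>> node_input = C.input_variable((1, 5, 5))
>>> node_referent = C.input_variable((1, 3, 3))
>>> cropped = C.crop_manual(node_input, node_referent, 1, 1)
>>> cropped.shape # doctest: +SKIP
(1, 3, 3)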

custom_proxy_op

A proxy node that helps save a model with a different number of operands.

custom_proxy_op(custom_op, output_shape, output_data_type, *operands, **kw_name)

Returns

cntk.ops.functions.Function

depth_to_space

Rearranges elements in the input tensor from the depth dimension into spatial blocks.

This operation is useful for implementing sub-pixel convolution that is part of models for image super-resolution (see [1]). It rearranges elements of an input tensor of shape (Cxbxb, H, W) to a tensor of shape (C, bxH, bxW), where b is the block_size.

depth_to_space(operand, block_size, name='')

Parameters

operand

Input tensor, with dimensions (C, H, W).

block_size
<xref:int>

Integer value. This defines the size of the spatial block where the depth elements move to. The number of channels, C, in the input tensor must be divisible by (block_size x block_size).

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

Function

Examples


>>> x = np.array(np.reshape(range(8), (8, 1, 1)), dtype=np.float32)
>>> x = np.tile(x, (1, 2, 3))
>>> a = C.input_variable((8, 2, 3))
>>> d2s_op = C.depth_to_space(a, block_size=2)
>>> d2s_op.eval({a:x})
array([[[[ 0.,  2.,  0.,  2.,  0.,  2.],
         [ 4.,  6.,  4.,  6.,  4.,  6.],
         [ 0.,  2.,  0.,  2.,  0.,  2.],
         [ 4.,  6.,  4.,  6.,  4.,  6.]],
<BLANKLINE>
        [[ 1.,  3.,  1.,  3.,  1.,  3.],
         [ 5.,  7.,  5.,  7.,  5.,  7.],
         [ 1.,  3.,  1.,  3.,  1.,  3.],
         [ 5.,  7.,  5.,  7.,  5.,  7.]]]], dtype=float32)
See also

[1] W. Shi et. al. : Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network.

dropout

Each element of the input is independently set to 0 with probability dropout_rate or to 1 / (1 - dropout_rate) times its original value (with probability 1-dropout_rate). Dropout is a good way to reduce overfitting.

This behavior only happens during training. During inference dropout is a no-op. In the paper that introduced dropout it was suggested to scale the weights during inference. In CNTK's implementation, because the values that are not set to 0 are multiplied with (1 / (1 - dropout_rate)), this is not necessary.

dropout(x, dropout_rate=0.0, seed=4294967293, name='')

Parameters

x

input tensor

dropout_rate
<xref:float>, [<xref:0,1>)

probability that an element of x will be set to zero

seed
<xref:int>

random seed.

name
<xref:cntk.ops.str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> data = [[10, 20],[30, 40],[50, 60]]
>>> C.dropout(data, 0.5).eval() # doctest: +SKIP
array([[  0.,  40.],
       [  0.,  80.],
       [  0.,   0.]], dtype=float32)

>>> C.dropout(data, 0.75).eval() # doctest: +SKIP
array([[   0.,    0.],
       [   0.,  160.],
       [   0.,  240.]], dtype=float32)

element_and

Computes the element-wise logic AND of x and y.

element_and(x, y, name='')

Parameters

x, y

numpy array or any Function that outputs a tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> C.element_and([1, 1, 0, 0], [1, 0, 1, 0]).eval()
array([ 1.,  0.,  0.,  0.], dtype=float32)

element_divide

The output of this operation is the element-wise division of the two input tensors. It supports broadcasting.

element_divide(left, right, name='')

Parameters

left

left side tensor

right

right side tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> C.element_divide([1., 1., 1., 1.], [0.5, 0.25, 0.125, 0.]).eval()
array([ 2.,  4.,  8.,  0.], dtype=float32)

>>> C.element_divide([5., 10., 15., 30.], [2.]).eval()
array([  2.5,   5. ,   7.5,  15. ], dtype=float32)

element_max

The output of this operation is the element-wise max of the two or more input tensors. It supports broadcasting.

element_max(left, right, name='')

Parameters

arg1

left side tensor

arg2

right side tensor

*more_args

additional inputs

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function
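
Examples

Illustrative examples following the definition above:

>>> C.element_max([1., 2., 3.], [2., 1., 4.]).eval()
array([ 2.,  2.,  4.], dtype=float32)

>>> C.element_max([5., 10., 15., 30.], [20.]).eval()
array([ 20.,  20.,  20.,  30.], dtype=float32)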

element_min

The output of this operation is the element-wise min of the two or more input tensors. It supports broadcasting.

element_min(left, right, name='')

Parameters

arg1

left side tensor

arg2

right side tensor

*more_args

additional inputs

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function
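
Examples

Illustrative examples following the definition above:

>>> C.element_min([1., 2., 3.], [2., 1., 4.]).eval()
array([ 1.,  1.,  3.], dtype=float32)

>>> C.element_min([5., 10., 15., 30.], [20.]).eval()
array([  5.,  10.,  15.,  20.], dtype=float32)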

element_not

Computes the element-wise logic NOT of x.

element_not(x, name='')

Parameters

x

numpy array or any Function that outputs a tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> C.element_not([1, 1, 0, 0]).eval()
array([ 0.,  0.,  1.,  1.], dtype=float32)

element_or

Computes the element-wise logic OR of x and y.

element_or(x, y, name='')

Parameters

x, y

numpy array or any Function that outputs a tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> C.element_or([1, 1, 0, 0], [1, 0, 1, 0]).eval()
array([ 1.,  1.,  1.,  0.], dtype=float32)

element_select

Returns either value_if_true or value_if_false based on the value of flag. If flag != 0, value_if_true is returned, otherwise value_if_false. Behaves analogously to numpy.where(...).

element_select(flag, value_if_true, value_if_false, name='')

Parameters

flag

condition tensor

value_if_true

true branch tensor

value_if_false

false branch tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> C.element_select([-10, -1, 0, 0.3, 100], [1, 10, 100, 1000, 10000], [ 2, 20, 200, 2000, 20000]).eval()
array([     1.,     10.,    200.,   1000.,  10000.], dtype=float32)

element_times

The output of this operation is the element-wise product of the two or more input tensors. It supports broadcasting.

element_times(left, right, name='')

Parameters

arg1

left side tensor

arg2

right side tensor

*more_args

additional inputs

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> C.element_times([1., 1., 1., 1.], [0.5, 0.25, 0.125, 0.]).eval()
array([ 0.5  ,  0.25 ,  0.125,  0.   ], dtype=float32)

>>> C.element_times([5., 10., 15., 30.], [2.]).eval()
array([ 10.,  20.,  30.,  60.], dtype=float32)

>>> C.element_times([5., 10., 15., 30.], [2.], [1., 2., 1., 2.]).eval()
array([  10.,   40.,   30.,  120.], dtype=float32)

element_xor

Computes the element-wise logic XOR of x and y.

element_xor(x, y, name='')

Parameters

x, y

numpy array or any Function that outputs a tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> C.element_xor([1, 1, 0, 0], [1, 0, 1, 0]).eval()
array([ 0.,  1.,  1.,  0.], dtype=float32)

elu

Exponential linear unit operation. Computes the element-wise exponential linear of x: max(x, 0) for x >= 0 and alpha * (exp(x)-1) otherwise.

The output tensor has the same shape as x.

elu(x, alpha=1.0, name='')

Parameters

x
<xref:numpy.array> or Function

any Function that outputs a tensor.

name
<xref:str>, <xref:default to ''>

the name of the Function instance in the network

Returns

An instance of Function

Return type

cntk.ops.functions.Function

Examples


>>> C.elu([[-1, -0.5, 0, 1, 2]]).eval()
array([[-0.632121, -0.393469,  0.      ,  1.      ,  2.      ]], dtype=float32)

equal

Elementwise 'equal' comparison of two tensors. Result is 1 if values are equal, 0 otherwise.

equal(left, right, name='')

Parameters

left

left side tensor

right

right side tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> C.equal([41., 42., 43.], [42., 42., 42.]).eval()
array([ 0.,  1.,  0.], dtype=float32)

>>> C.equal([-1,0,1], [1]).eval()
array([ 0.,  0.,  1.], dtype=float32)

exp

Computes the element-wise exponential of x:

exp(x, name='')

Parameters

x

numpy array or any Function that outputs a tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> C.exp([0., 1.]).eval()
array([ 1.      ,  2.718282], dtype=float32)

expand_dims

Adds a singleton (size 1) axis at position axis.

expand_dims(x, axis, name='')

Parameters

x

input tensor

axis

The position to insert the singleton axis.

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> x0 = np.arange(12).reshape((2, 2, 3)).astype('f')
>>> x = C.input_variable((2, 3))
>>> C.expand_dims(x, 0).eval({x: x0})
array([[[[  0.,   1.,   2.]],
<BLANKLINE>
        [[  3.,   4.,   5.]]],
<BLANKLINE>
<BLANKLINE>
       [[[  6.,   7.,   8.]],
<BLANKLINE>
        [[  9.,  10.,  11.]]]], dtype=float32)

eye_like

Creates a matrix with diagonal set to 1s and of the same shape and the same dynamic axes as x. To be a matrix, x must have exactly two axes (counting both dynamic and static axes).

eye_like(x, sparse_output=True, name='')

Parameters

x

numpy array or any Function that outputs a tensor of rank 2

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> x0 = np.arange(12).reshape((3, 4)).astype('f')
>>> x = C.input_variable(4)
>>> C.eye_like(x).eval({x: x0}).toarray()
array([[ 1.,  0.,  0.,  0.],
        [ 0.,  1.,  0.,  0.],
        [ 0.,  0.,  1.,  0.]], dtype=float32)

flatten

Flattens the input tensor into a 2D matrix. If the input tensor has shape (d_0, d_1, ..., d_n) then the output will have shape (d_0 x d_1 x ... x d_(axis-1), d_axis x d_(axis+1) x ... x d_n).

flatten(x, axis=None, name='')

Parameters

x

Input tensor.

axis
<xref:int>

(Default to 0) Indicates up to which input dimensions (exclusive) should be flattened to the outer dimension of the output

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> # create 2x3x4 matrix, flatten the matrix at axis = 1
>>> shape = (2, 3, 4)
>>> data = np.reshape(np.arange(np.prod(shape), dtype = np.float32), shape)
>>> C.flatten(data, 1).eval()
array([[  0.,   1.,   2.,   3.,   4.,   5.,   6.,   7.,   8.,   9.,  10.,
         11.],
       [ 12.,  13.,  14.,  15.,  16.,  17.,  18.,  19.,  20.,  21.,  22.,
         23.]], dtype=float32)

floor

The output of this operation is the element-wise value rounded to the largest integer less than or equal to the input.

floor(arg, name='')

Parameters

arg

input tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network (optional)

Returns

cntk.ops.functions.Function

Examples


>>> C.floor([0.2, 1.3, 4., 5.5, 0.0]).eval()
array([ 0.,  1.,  4.,  5.,  0.], dtype=float32)

>>> C.floor([[0.6, 3.3], [1.9, 5.6]]).eval()
array([[ 0.,  3.],
       [ 1.,  5.]], dtype=float32)

>>> C.floor([-5.5, -4.2, -3., -0.7, 0]).eval()
array([-6., -5., -3., -1.,  0.], dtype=float32)

>>> C.floor([[-0.6, -4.3], [1.9, -3.2]]).eval()
array([[-1., -5.],
       [ 1., -4.]], dtype=float32)

forward_backward

Criterion node for training methods that rely on forward-backward Viterbi-like passes, e.g. Connectionist Temporal Classification (CTC) training. The node takes as input the graph of labels, produced by the labels_to_graph operation, which determines the exact forward/backward procedure.

This op requires that both graph and features have the same dynamic sequence axis (i.e. same sequence length).

Labels feed into cntk.labels_to_graph and hence to forward_backward are represented as 1-hot vectors for each frame. The 1-hot vectors may have either value 1 or 2 at the position of the phone corresponding to the frame, where the value 1 means the frame is within phone boundary (i.e. not the first frame of the phone) and 2 means the frame is the phone boundary (i.e. is the first frame of the phone). If you are using HTKMLFDeserializer, this encoding will be automatically done.

For cases where labels are not frame-aligned (i.e. sequence length of labels is shorter than the sequence length of features), you can pad the labels by duplicating the last label and setting the value as 1 until the sequence length is equal to features.

Alternatively, you can also generate the labels to have a uniform (equal) distribution of the labels across the feature frames, keeping in mind to set the value in the one-hot encoding appropriately (1 or 2, depending on whether the label is the first frame of the phone or not).

forward_backward(graph, features, blankTokenId, delayConstraint=-1, name='')

Parameters

graph

labels graph

features

network output

blankTokenId

id of the CTC blank label

delayConstraint

label output delay constraint introduced during training that allows having a shorter delay during inference. This uses the original time information to enforce that CTC tokens only get aligned within a time margin. Setting this parameter smaller will result in a shorter delay between label outputs during decoding, yet may hurt accuracy. delayConstraint=-1 means no constraint.

Returns

cntk.ops.functions.Function

Examples

Padding labels by duplicating the last label:

from sklearn.preprocessing import LabelBinarizer
lb = LabelBinarizer(pos_label=2).fit(range(6))
labels = lb.transform([0, 2, 0, 1, 3, 4])  # blank is the label 5
# labels = [[2,0,0,...,0,0], [0,0,2,...,0,0], ..., [0,0,0,...,2,0]]

# Retrieve the input's sequence length
sequence_dim = input.shape[-2]
expanded_labels = np.zeros((sequence_dim, labels.shape[-1]))
# We first copy the original one-hot labels
expanded_labels[:len(labels)] = labels
# Then, we replicate the last label as one-hot-1 encoded
expanded_labels[len(labels):, labels[-1].argmax()] = 1
# expanded_labels = [[2,0,0,...,0,0], [0,0,2,...,0,0], ...,
#                    [0,0,0,...,2,0], [0,0,0,...,1,0], ...,
#                    [0,0,0,...,1,0]]

# We can define the model and the variables
input_var = sequence.input_variable((input.shape[-1]), name='input')
labels_var = sequence.input_variable((6), name='label')
# The model should be defined here
model = ...

# Now, we can use the forward-backward algorithm
labels_graph = cntk.labels_to_graph(labels_var)
network_out = model(input_var)
fb = forward_backward(labels_graph, network_out, 5)
fb.eval({'input': input.astype(np.float32), 'label': expanded_labels.astype(np.float32)})

gather

Retrieves the elements of indices in the tensor reference.

gather(reference, indices, axis=None, name='')

Parameters

reference

A tensor of values

indices

A tensor of indices

axis

The axis that the indices refer to. Default (None) means the first axis. Only one axis is supported, and only a static axis.

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> c = np.asarray([[[0],[1]],[[4],[5]]]).astype('f')
>>> x = C.input_variable((2,1))
>>> d = np.arange(12).reshape(6,2).astype('f')
>>> y = C.constant(d)
>>> C.gather(y, x).eval({x:c})
array([[[[  0.,   1.]],
<BLANKLINE>
        [[  2.,   3.]]],
<BLANKLINE>
<BLANKLINE>
       [[[  8.,   9.]],
<BLANKLINE>
        [[ 10.,  11.]]]], dtype=float32)

greater

Elementwise 'greater' comparison of two tensors. Result is 1 if left > right else 0.

greater(left, right, name='')

Parameters

left

left side tensor

right

right side tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> C.greater([41., 42., 43.], [42., 42., 42.]).eval()
array([ 0.,  0.,  1.], dtype=float32)

>>> C.greater([-1,0,1], [0]).eval()
array([ 0.,  0.,  1.], dtype=float32)

greater_equal

Elementwise 'greater equal' comparison of two tensors. Result is 1 if left >= right else 0.

greater_equal(left, right, name='')

Parameters

left

left side tensor

right

right side tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> C.greater_equal([41., 42., 43.], [42., 42., 42.]).eval()
array([ 0.,  1.,  1.], dtype=float32)

>>> C.greater_equal([-1,0,1], [0]).eval()
array([ 0.,  1.,  1.], dtype=float32)

hard_sigmoid

Computes the element-wise HardSigmoid function, y = max(0, min(1, alpha * x + beta)).

hard_sigmoid(x, alpha, beta, name='')

Parameters

x

numpy array or any Function that outputs a tensor

alpha
<xref:float>

the alpha term of the above equation.

beta
<xref:float>

the beta term of the above equation.

name
<xref:str>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> alpha = 1
>>> beta = 2
>>> C.hard_sigmoid([-2.5, -1.5, 1], alpha, beta).eval()
array([ 0. ,  0.5,  1. ], dtype=float32)

hardmax

Creates a tensor with the same shape as the input tensor, with zeros everywhere and a 1.0 where the maximum value of the input tensor is located. If the maximum value is repeated, 1.0 is placed in the first location found.

hardmax(x, name='')

Parameters

x

numpy array or any Function that outputs a tensor

name
<xref:str>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> C.hardmax([1., 1., 2., 3.]).eval()
array([ 0.,  0.,  0.,  1.], dtype=float32)

>>> C.hardmax([1., 3., 2., 3.]).eval()
array([ 0.,  1.,  0.,  0.], dtype=float32)

image_scaler

Alteration of image by scaling its individual values.

image_scaler(x, scalar, biases, name='')

Parameters

x
<xref:numpy.array> or Function

any Function that outputs a tensor.

scalar
<xref:float>

Scalar channel factor.

bias
<xref:numpy array>

Bias values for each channel.

Returns

An instance of Function

Return type

cntk.ops.functions.Function

input

DEPRECATED.

It creates an input in the network: a place where data, such as features and labels, should be provided.

input(shape, dtype=<cntk.default_options.default_override_or object>, needs_gradient=False, is_sparse=False, dynamic_axes=[Axis('defaultBatchAxis')], name='')

Parameters

shape
<xref:tuple> or <xref:int>

the shape of the input tensor

dtype
<xref:np.float32> or <xref:np.float64> or <xref:np.float16>

data type. Default is np.float32.

needs_gradient
<xref:bool>, <xref:optional>

whether to back-propagates to it or not. False by default.

is_sparse
<xref:bool>, <xref:optional>

whether the variable is sparse (False by default)

dynamic_axes
<xref:list> or <xref:tuple>, <xref:default>

a list of dynamic axis (e.g., batch axis, sequence axis)

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.variables.Variable

input_variable

It creates an input in the network: a place where data, such as features and labels, should be provided.

input_variable(shape, dtype=np.float32, needs_gradient=False, is_sparse=False, dynamic_axes=[Axis.default_batch_axis()], name='')

Parameters

shape
<xref:tuple> or <xref:int>

the shape of the input tensor

dtype
<xref:np.float32> or <xref:np.float64> or <xref:np.float16>

data type. Default is np.float32.

needs_gradient
<xref:bool>, <xref:optional>

whether to back-propagates to it or not. False by default.

is_sparse
<xref:bool>, <xref:optional>

whether the variable is sparse (False by default)

dynamic_axes
<xref:list> or <xref:tuple>, <xref:default>

a list of dynamic axis (e.g., batch axis, time axis)

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.variables.Variable
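
Examples

A minimal illustrative sketch:

>>> x = C.input_variable((2, 3), name='features')
>>> x.shape
(2, 3)
>>> x.name
'features'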

labels_to_graph

Conversion node from labels to graph. Typically used as an input to the ForwardBackward node. This node's objective is to transform input labels into a graph representing the exact forward-backward criterion.

labels_to_graph(labels, name='')

Parameters

labels

input training labels

Returns

cntk.ops.functions.Function

Examples


>>> num_classes = 2
>>> labels = C.input_variable((num_classes))
>>> graph = C.labels_to_graph(labels)

leaky_relu

Leaky Rectified linear operation. Computes the element-wise leaky rectified linear of x: max(x, 0) for x >= 0 and alpha*x otherwise.

The output tensor has the same shape as x.

leaky_relu(x, alpha=0.01, name='')

Parameters

x
<xref:numpy.array> or Function

any Function that outputs a tensor.

alpha
<xref:float>

the alpha term of the above equation.

name
<xref:str>, <xref:default to ''>

the name of the Function instance in the network

Returns

An instance of Function

Return type

cntk.ops.functions.Function

Examples


>>> C.leaky_relu([[-1, -0.5, 0, 1, 2]]).eval()
array([[-0.01 , -0.005,  0.   ,  1.   ,  2.   ]], dtype=float32)

less

Elementwise 'less' comparison of two tensors. Result is 1 if left < right else 0.

less(left, right, name='')

Parameters

left

left side tensor

right

right side tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> C.less([41., 42., 43.], [42., 42., 42.]).eval()
array([ 1.,  0.,  0.], dtype=float32)

>>> C.less([-1,0,1], [0]).eval()
array([ 1.,  0.,  0.], dtype=float32)

less_equal

Elementwise 'less equal' comparison of two tensors. Result is 1 if left <= right else 0.

less_equal(left, right, name='')

Parameters

left

left side tensor

right

right side tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> C.less_equal([41., 42., 43.], [42., 42., 42.]).eval()
array([ 1.,  1.,  0.], dtype=float32)

>>> C.less_equal([-1,0,1], [0]).eval()
array([ 1.,  1.,  0.], dtype=float32)

local_response_normalization

Local Response Normalization layer. See Section 3.3 of the paper:

https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf

The mathematical equation is:

b_{x,y}^i=a_{x,y}^i/(bias+\alpha\sum_{j=max(0,i-depth_radius)}^{min(N-1, i+depth_radius)}(a_{x,y}^j)^2)^\beta

where a_{x,y}^i is the activity of a neuron computed by applying kernel i at position (x,y), N is the total number of kernels, and depth_radius is half the normalization width.

local_response_normalization(operand, depth_radius, bias, alpha, beta, name='')

Parameters

operand

input of the Local Response Normalization.

depth_radius
<xref:int>

the radius on the channel dimension to apply the normalization.

bias
<xref:double>

a bias term to avoid divide by zero.

alpha
<xref:double>

the alpha term of the above equation.

beta
<xref:double>

the beta term of the above equation.

name
<xref:str>, <xref:optional>

the name of the Function instance in the network.

Returns

cntk.ops.functions.Function
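
Examples

A construction-only sketch; the depth_radius, bias, alpha and beta values are illustrative:

>>> y = C.input_variable((3, 8, 8))
>>> lrn = C.local_response_normalization(y, depth_radius=1, bias=1.0, alpha=0.0001, beta=0.75)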

log

Computes the element-wise natural logarithm of x:

Note

CNTK returns -85.1 for log(x) if x is negative or zero. The reason is that it uses 1e-37 (whose natural logarithm is -85.1) as the smallest float number for log, because this is the only guaranteed precision across platforms. This will be changed to return NaN and -inf.

log(x, name='')

Parameters

x

numpy array or any Function that outputs a tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

Function

Examples


>>> C.log([1., 2.]).eval()
array([ 0.      ,  0.693147], dtype=float32)

log_add_exp

Calculates the log of the sum of the exponentials of the two or more input tensors. It supports broadcasting.

log_add_exp(left, right, name='')

Parameters

arg1

left side tensor

arg2

right side tensor

*more_args

additional inputs

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> a = np.arange(3,dtype=np.float32)
>>> np.exp(C.log_add_exp(np.log(1+a), np.log(1+a*a)).eval())
array([ 2.,  4.,  8.], dtype=float32)
>>> np.exp(C.log_add_exp(np.log(1+a), [0.]).eval())
array([ 2.,  3.,  4.], dtype=float32)

log_softmax

Computes the logsoftmax normalized values of x. That is, y = x - log(reduce_sum(exp(x), axis)) (the implementation uses an equivalent formula for numerical stability).

It is also possible to use x - reduce_log_sum_exp(x, axis) instead of log_softmax: this can be faster (one reduce pass instead of two), but can behave slightly differently numerically.

log_softmax(x, axis=None, name='')

Parameters

x

numpy array or any Function that outputs a tensor

axis
<xref:int>

axis along which the logsoftmax operation will be performed (the default is the last axis)

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function
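
Examples

An illustrative example; for two equal inputs, each log-softmax value is -log(2):

>>> np.round(C.log_softmax([1., 1.]).eval(), 5)
array([-0.69315, -0.69315], dtype=float32)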

mean

Create a new Function instance that computes element-wise mean of input tensors.

mean(*operands, **kw_name)

Parameters

operands
<xref:list>

list of functions

name
<xref:str>, <xref:optional>

the name of the mean Function in the network

Returns

cntk.ops.functions.Function

Examples


>>> in1 = C.input_variable((4,))
>>> in2 = C.input_variable((4,))
>>> model = C.mean([in1, in2])
>>> in1_data = np.asarray([[1., 2., 3., 4.]], np.float32)
>>> in2_data = np.asarray([[0., 5., -3., 2.]], np.float32)
>>> model.eval({in1: in1_data, in2: in2_data})
array([[ 0.5,  3.5,  0. ,  3. ]], dtype=float32)

mean_variance_normalization

Computes mean-variance normalization of the specified input operand.

This operation computes the mean and variance for the entire tensor if use_stats_across_channels is True. If use_stats_across_channels is False, it computes the mean and variance per channel and normalizes each channel with its own mean and variance. If do_variance_scaling is False, only the mean is subtracted and the variance scaling is omitted.

mean_variance_normalization(operand, epsilon=1e-05, use_stats_across_channels=False, do_variance_scaling=True, name='')

Parameters

operand

Input tensor, with dimensions [C x H x W].

epsilon
<xref:double>, <xref:default 0.00001>

epsilon added to the standard deviation to avoid division by 0.

use_stats_across_channels
<xref:bool>

If False, mean and variance are computed per channel. If True, mean and variance are computed over the entire tensor (all axes).

do_variance_scaling
<xref:bool>

If False, only the mean is subtracted. If True, it is also scaled by inverse of standard deviation.

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> data = np.array([[[0., 2], [4., 6.]], [[0., 4], [8., 12.]]]).astype(np.float32)
>>> data
array([[[  0.,   2.],
        [  4.,   6.]],
<BLANKLINE>
       [[  0.,   4.],
        [  8.,  12.]]], dtype=float32)
>>> saved_precision = np.get_printoptions()['precision']
>>> np.set_printoptions(precision=4) # For consistent display upto 4 decimals.
>>> C.mean_variance_normalization(data).eval()
array([[[-1.3416, -0.4472],
        [ 0.4472,  1.3416]],
<BLANKLINE>
       [[-1.3416, -0.4472],
        [ 0.4472,  1.3416]]], dtype=float32)
>>> np.set_printoptions(precision=saved_precision) # Reseting the display precision.

minus

The output of this operation is left minus right tensor. It supports broadcasting.

minus(left, right, name='')

Parameters

left

left side tensor

right

right side tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> C.minus([1, 2, 3], [4, 5, 6]).eval()
array([-3., -3., -3.], dtype=float32)

>>> C.minus([[1,2],[3,4]], 1).eval()
array([[ 0.,  1.],
       [ 2.,  3.]], dtype=float32)

negate

Computes the element-wise negation of x:

negate(x, name='')

Parameters

x

numpy array or any Function that outputs a tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> C.negate([-1, 1, -2, 3]).eval()
array([ 1., -1.,  2., -3.], dtype=float32)

not_equal

Elementwise 'not equal' comparison of two tensors. Result is 1 if left != right else 0.

not_equal(left, right, name='')

Parameters

left

left side tensor

right

right side tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> C.not_equal([41., 42., 43.], [42., 42., 42.]).eval()
array([ 1.,  0.,  1.], dtype=float32)

>>> C.not_equal([-1,0,1], [0]).eval()
array([ 1.,  0.,  1.], dtype=float32)

one_hot

Creates a one-hot tensor from the input tensor.

one_hot(x, num_classes, sparse_output=False, axis=-1, name='')

Parameters

x

input tensor; each value must be a non-negative integer less than num_classes

num_classes

the number of classes in the one-hot tensor

sparse_output

if True, the one-hot tensor is created as a sparse tensor.

axis

The axis to fill (default: -1, a new inner-most axis).

name
<xref:str>, <xref:optional>, <xref:keyword only>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> data = np.asarray([[1, 2],
...                    [4, 5]], dtype=np.float32)

>>> x = C.input_variable((2,))
>>> C.one_hot(x, 6, False).eval({x:data})
array([[[ 0.,  1.,  0.,  0.,  0.,  0.],
        [ 0.,  0.,  1.,  0.,  0.,  0.]],
<BLANKLINE>
       [[ 0.,  0.,  0.,  0.,  1.,  0.],
        [ 0.,  0.,  0.,  0.,  0.,  1.]]], dtype=float32)

ones_like

Creates an all-ones tensor with the same shape and dynamic axes as x:

ones_like(x, name='')

Parameters

x

numpy array or any Function that outputs a tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> x0 = np.arange(24).reshape((2, 3, 4)).astype('f')
>>> x = C.input_variable((3, 4))
>>> C.ones_like(x).eval({x: x0})
array([[[ 1.,  1.,  1.,  1.],
        [ 1.,  1.,  1.,  1.],
        [ 1.,  1.,  1.,  1.]],
<BLANKLINE>
       [[ 1.,  1.,  1.,  1.],
        [ 1.,  1.,  1.,  1.],
        [ 1.,  1.,  1.,  1.]]], dtype=float32)

optimized_rnnstack

An RNN implementation that uses the primitives in cuDNN. If cuDNN is not available, it fails. You can use convert_optimized_rnnstack to convert a model to a GEMM-based implementation when cuDNN is not available.

optimized_rnnstack(operand, weights, hidden_size, num_layers, bidirectional=False, recurrent_op='lstm', name='')

Returns

cntk.ops.functions.Function

Examples


>>> from _cntk_py import constant_initializer
>>> W = C.parameter((C.InferredDimension,4), constant_initializer(0.1))
>>> x = C.input_variable(shape=(4,))
>>> s = np.reshape(np.arange(20.0, dtype=np.float32), (5,4))
>>> t = np.reshape(np.arange(12.0, dtype=np.float32), (3,4))
>>> f = C.optimized_rnnstack(x, W, 8, 2) # doctest: +SKIP
>>> r = f.eval({x:[s,t]})                # doctest: +SKIP
>>> len(r)                               # doctest: +SKIP
2
>>> print(*r[0].shape)                   # doctest: +SKIP
5 8
>>> print(*r[1].shape)                   # doctest: +SKIP
3 8
>>> r[0][:3,:]-r[1]                      # doctest: +SKIP
array([[ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.]], dtype=float32)

output_variable

It creates an output variable that is used to define a user-defined function.

output_variable(shape, dtype, dynamic_axes, needs_gradient=True, name='')

Parameters

shape
<xref:tuple> or <xref:int>

the shape of the input tensor

dtype
<xref:np.float32> or <xref:np.float64> or <xref:np.float16>

data type

dynamic_axes
<xref:list> or <xref:tuple>

a list of dynamic axes (e.g., batch axis, time axis)

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

Variable that is of output type
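
Examples

A rough sketch of how output_variable is typically used inside a user-defined function's infer_outputs method. This example is illustrative and not part of the original reference; MySigmoid is a hypothetical class name:


   import numpy as np
   import cntk as C
   from cntk.ops.functions import UserFunction

   class MySigmoid(UserFunction):       # hypothetical user-defined function
       def __init__(self, arg, name='MySigmoid'):
           super(MySigmoid, self).__init__([arg], name=name)

       def infer_outputs(self):
           # declare one output with the same shape, dtype and dynamic axes as the input
           return [C.output_variable(self.inputs[0].shape, self.inputs[0].dtype,
                                     self.inputs[0].dynamic_axes)]

       def forward(self, argument, device=None, outputs_to_retain=None):
           result = 1 / (1 + np.exp(-argument))
           return result, result                 # (state kept for backward, output value)

       def backward(self, state, root_gradients):
           return root_gradients * state * (1 - state)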

pad

Pads a tensor according to the specified patterns. Three padding modes are supported: CONSTANT / REFLECT / SYMMETRIC.

pad(x, pattern, mode=0, constant_value=0, name='')

Parameters

x

tensor to be padded.

pattern
<xref:list of tuple with 2 integers>

how many values to add before and after the contents of the tensor in each dimension.

mode
<xref:int>

padding mode: C.ops.CONSTANT_PAD, C.ops.REFLECT_PAD and C.ops.SYMMETRIC_PAD

constant_value

the value used to fill the padding cells, only meaningful under CONSTANT mode.

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> data = np.arange(6, dtype=np.float32).reshape((2,3))
>>> x = C.constant(value=data)
>>> C.pad(x, pattern=[(1,1),(2,2)], mode=C.ops.CONSTANT_PAD, constant_value=1).eval()
array([[ 1.,  1.,  1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  0.,  1.,  2.,  1.,  1.],
       [ 1.,  1.,  3.,  4.,  5.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.,  1.,  1.]], dtype=float32)
>>> C.pad(x, pattern=[(1,1),(2,2)], mode=C.ops.REFLECT_PAD).eval()
array([[ 5.,  4.,  3.,  4.,  5.,  4.,  3.],
       [ 2.,  1.,  0.,  1.,  2.,  1.,  0.],
       [ 5.,  4.,  3.,  4.,  5.,  4.,  3.],
       [ 2.,  1.,  0.,  1.,  2.,  1.,  0.]], dtype=float32)
>>> C.pad(x, pattern=[(1,1),(2,2)], mode=C.ops.SYMMETRIC_PAD).eval()
array([[ 1.,  0.,  0.,  1.,  2.,  2.,  1.],
       [ 1.,  0.,  0.,  1.,  2.,  2.,  1.],
       [ 4.,  3.,  3.,  4.,  5.,  5.,  4.],
       [ 4.,  3.,  3.,  4.,  5.,  5.,  4.]], dtype=float32)

param_relu

Parametric rectified linear operation. Computes the element-wise parametric rectified linear of x: x for x >= 0 and alpha*x otherwise.

The output tensor has the same shape as x.

param_relu(alpha, x, name='')

Parameters

alpha
Parameter

same shape as x

x
<xref:numpy.array> or Function

any Function that outputs a tensor.

name
<xref:str>, <xref:default to ''>

the name of the Function instance in the network

Returns

An instance of Function

Return type

Examples


>>> alpha = C.constant(value=[[0.5, 0.5, 0.5, 0.5, 0.5]])
>>> C.param_relu(alpha, [[-1, -0.5, 0, 1, 2]]).eval()
array([[-0.5 , -0.25,  0.  ,  1.  ,  2.  ]], dtype=float32)

parameter

It creates a parameter tensor.

parameter(shape=None, init=None, dtype=None, device=None, name='')

Parameters

shape
<xref:tuple> or <xref:int>, <xref:optional>

the shape of the input tensor. If not provided, it will be inferred from value.

init
<xref:scalar> or <xref:NumPy array> or <xref:initializer>

if init is a scalar, it will be replicated for every element in the tensor; if it is a NumPy array, its values are used directly. If it is the output of an initializer from cntk, it will be used to initialize the tensor at the first forward pass. If None, the tensor will be initialized with 0.

dtype
<xref:optional>

data type of the parameter. If a NumPy array and dtype are given, then data will be converted if needed. If none is given, it will default to np.float32.

device
DeviceDescriptor

instance of DeviceDescriptor

name
<xref:str>, <xref:optional>

the name of the Parameter instance in the network

Returns

cntk.variables.Parameter

Examples


>>> init_parameter = C.parameter(shape=(3,4), init=2)
>>> np.asarray(init_parameter) # doctest: +SKIP
array([[ 2.,  2.,  2.,  2.],
       [ 2.,  2.,  2.,  2.],
       [ 2.,  2.,  2.,  2.]], dtype=float32)

per_dim_mean_variance_normalize

Computes per dimension mean-variance normalization of the specified input operand.

per_dim_mean_variance_normalize(operand, mean, inv_stddev, name='')

Parameters

operand

the variable to be normalized

mean
<xref:NumPy array>

per dimension mean to use for the normalization

inv_stddev
<xref:NumPy array>

per dimension inverse of the standard deviation to use for the normalization

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function
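
Examples

A minimal sketch (not part of the original reference), assuming the op computes (operand - mean) * inv_stddev element-wise per dimension:

>>> x = C.input_variable((3,))
>>> mean = np.array([1., 1., 1.], dtype=np.float32)
>>> inv_stddev = np.array([2., 2., 2.], dtype=np.float32)
>>> C.per_dim_mean_variance_normalize(x, mean, inv_stddev).eval({x: np.array([[1., 2., 3.]], dtype=np.float32)})
array([[ 0.,  2.,  4.]], dtype=float32)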

placeholder

It creates a placeholder variable that has to be later bound to an actual variable. A common use of this is to serve as a placeholder for a later output variable in a recurrent network, which is replaced with the actual output variable by calling replace_placeholder(s).

placeholder(shape=None, dynamic_axes=None, name='')

Parameters

shape
<xref:tuple> or <xref:int>

the shape of the variable tensor

dynamic_axes
<xref:list>

the list of dynamic axes that the actual variable uses

name
<xref:str>, <xref:optional>

the name of the placeholder variable in the network

Returns

cntk.variables.Variable
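
Examples

A minimal sketch (not part of the original reference) showing a placeholder that is bound to a concrete operand later via replace_placeholders:

>>> x = C.input_variable((2,))
>>> p = C.placeholder(shape=(2,))
>>> f = C.plus(x, p)                                    # graph built against the placeholder
>>> g = f.replace_placeholders({p: C.constant([10., 20.])})
>>> g.eval({x: np.array([[1., 2.]], dtype=np.float32)})
array([[ 11.,  22.]], dtype=float32)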

plus

The output of this operation is the sum of two or more input tensors. It supports broadcasting.

plus(left, right, name='')

Parameters

arg1

left side tensor

arg2

right side tensor

*more_args

additional inputs

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> C.plus([1, 2, 3], [4, 5, 6]).eval()
array([ 5.,  7.,  9.], dtype=float32)

>>> C.plus([-5, -4, -3, -2, -1], [10]).eval()
array([ 5.,  6.,  7.,  8.,  9.], dtype=float32)

>>> C.plus([-5, -4, -3, -2, -1], [10], [3, 2, 3, 2, 3], [-13], [+42], 'multi_arg_example').eval()
array([ 37.,  37.,  39.,  39.,  41.], dtype=float32)

>>> C.plus([-5, -4, -3, -2, -1], [10], [3, 2, 3, 2, 3]).eval()
array([  8.,   8.,  10.,  10.,  12.], dtype=float32)

pooling

The pooling operations compute a new tensor by selecting the maximum or average value in the pooling input. In the case of average pooling with padding, the average is only over the valid region.

N-dimensional pooling allows creating max or average pooling with any dimensions, strides, or padding.

pooling(operand, pooling_type, pooling_window_shape, strides=(1,), auto_padding=[False], ceil_out_dim=False, include_pad=False, name='')

Parameters

operand

pooling input

pooling_type

one of <xref:cntk.ops.MAX_POOLING> or <xref:cntk.ops.AVG_POOLING>

pooling_window_shape

dimensions of the pooling window

strides
<xref:default 1>

strides.

auto_padding
<xref:default [False]>

automatic padding flags for each input dimension.

ceil_out_dim
<xref:default False>

ceiling while computing output size

include_pad
<xref:default False>

include pad while average pooling

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> img = np.reshape(np.arange(16, dtype = np.float32), [1, 4, 4])
>>> x = C.input_variable(img.shape)
>>> C.pooling(x, C.AVG_POOLING, (2,2), (2,2)).eval({x : [img]})
array([[[[  2.5,   4.5],
          [ 10.5,  12.5]]]], dtype=float32)
>>> C.pooling(x, C.MAX_POOLING, (2,2), (2,2)).eval({x : [img]})
array([[[[  5.,   7.],
          [ 13.,  15.]]]], dtype=float32)

pow

Computes base raised to the power of exponent. It supports broadcasting. This is well defined if base is non-negative or exponent is an integer. Otherwise the result is NaN. The gradient with respect to the base is well defined if the forward operation is well defined. The gradient with respect to the exponent is well defined if the base is non-negative, and it is set to 0 otherwise.

pow(base, exponent, name='')

Parameters

base

base tensor

exponent

exponent tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> C.pow([1, 2, -2], [3, -2, 3]).eval()
array([ 1.  ,  0.25, -8.  ], dtype=float32)

>>> C.pow([[0.5, 2],[4, 1]], -2).eval()
array([[ 4.    ,  0.25  ],
       [ 0.0625,  1.    ]], dtype=float32)

random_sample

Estimates inclusion frequencies for random sampling with or without replacement.

The output value is a set of num_samples random samples represented by a (sparse) matrix of shape [num_samples x len(weights)], where len(weights) is the number of classes (categories) to choose from. The output has no dynamic axis. The samples are drawn according to the weight vector p(i) = weights[i] / sum(weights). We get one set of samples per minibatch. Intended use cases are e.g. sampled softmax, noise contrastive estimation etc.

random_sample(weights, num_samples, allow_duplicates, seed=4294967293, name='')

Parameters

weights

input vector of sampling weights which should be non-negative numbers.

num_samples
<xref:int>

number of expected samples

allow_duplicates
<xref:bool>

If sampling is done with replacement (True) or without (False).

seed
<xref:int>

random seed.

name
<xref:cntk.ops.str>, <xref:optional>

the name of the Function instance in the network.

Returns

cntk.ops.functions.Function
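
Examples

A rough sketch (not part of the original reference): drawing 3 samples with replacement from 10 equally weighted classes yields a (possibly sparse) 3 x 10 matrix with one non-zero entry per row:

>>> w = np.full((10,), 1., dtype=np.float32)
>>> C.random_sample(w, 3, True, seed=98052).eval().shape
(3, 10)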

random_sample_inclusion_frequency

For weighted sampling with the specified sample size (num_samples) this operation computes the expected number of occurrences of each class in the sampled set. In case of sampling without replacement the result is only an estimate which might be quite rough in the case of small sample sizes. Intended uses are e.g. sampled softmax, noise contrastive estimation etc. This operation will be typically used together with random_sample.

random_sample_inclusion_frequency(weights, num_samples, allow_duplicates, seed=4294967293, name='')

Returns

cntk.ops.functions.Function

Examples


>>> import numpy as np
>>> from cntk import *
>>> # weight vector with 100 '1000'-values followed
>>> # by 100 '1' values
>>> w1 = np.full((100),1000, dtype = np.float64)
>>> w2 = np.full((100),1, dtype = np.float64)
>>> w = np.concatenate((w1, w2))
>>> f = random_sample_inclusion_frequency(w, 150, True).eval()
>>> f[0]
1.4985015
>>> f[1]
1.4985015
>>> f[110]
0.0014985015
>>> # when switching to sampling without duplicates samples are
>>> # forced to pick the low weight classes too
>>> f = random_sample_inclusion_frequency(w, 150, False).eval()
>>> f[0]
1.0

reciprocal

Computes the element-wise reciprocal of x:

reciprocal(x, name='')

Parameters

x

numpy array or any Function that outputs a tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> C.reciprocal([-1/3, 1/5, -2, 3]).eval()
array([-3.      ,  5.      , -0.5     ,  0.333333], dtype=float32)

reconcile_dynamic_axes

Create a new Function instance which reconciles the dynamic axes of the specified tensor operands. The output of the returned Function has the sample layout of the 'x' operand and the dynamic axes of the 'dynamic_axes_as' operand. This operator also performs a runtime check to ensure that the dynamic axes layouts of the 2 operands indeed match.

reconcile_dynamic_axes(x, dynamic_axes_as, name='')

Parameters

x

The Function/Variable, whose dynamic axes are to be reconciled

dynamic_axes_as

The Function/Variable to whose dynamic axes the dynamic axes of operand 'x' are reconciled.

name
<xref:str>, <xref:optional>

the name of the reconcile_dynamic_axes Function in the network

Returns

cntk.ops.functions.Function

reduce_l1

Computes the L1 norm of the input tensor's elements along the provided axes. The resulting tensor has the same rank as the input if keepdims is True. If keepdims is False, the reduced dimensions are pruned from the resulting tensor.

reduce_l1(x, axis=None, keepdims=True, name='')

Parameters

x

input tensor

axis
<xref:int> or Axis or <xref:a list> or <xref:tuple of int> or Axis

axis along which the reduction will be performed

keepdims
<xref:boolean>

Whether to keep the reduced dimension (default True).

name
<xref:str>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> C.reduce_l1([[[1,2], [3,4]],[[5,6], [7,8]],[[9,10], [11,12]]], 2, False).eval()
array([[  3.,   7.],
       [ 11.,  15.],
       [ 19.,  23.]], dtype=float32)

reduce_l2

Computes the L2 norm of the input tensor's elements along the provided axes. The resulting tensor has the same rank as the input if keepdims is True. If keepdims is False, the reduced dimensions are pruned from the resulting tensor.

reduce_l2(x, axis=None, keepdims=True, name='')

Parameters

x

input tensor

axis
<xref:int> or Axis or <xref:a list> or <xref:tuple of int> or Axis

axis along which the reduction will be performed

keepdims
<xref:boolean>

Whether to keep the reduced dimension (default True).

name
<xref:str>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> C.reduce_l2([[[1,2], [3,4]]], 2).eval()
array([[[ 2.236068],
        [ 5.        ]]], dtype=float32)

reduce_log_sum_exp

Computes the log of the sum of the exponentiations of the input tensor's elements across a specified axis or a list of specified axes.

Note that CNTK keeps the shape of the resulting tensors when reducing over multiple static axes.

reduce_log_sum_exp(x, axis=None, keepdims=True, name='')

Parameters

x

input tensor

axis
<xref:int> or Axis or <xref:a list> or <xref:tuple of int> or Axis

axis along which the reduction will be performed

keepdims
<xref:boolean>

Whether to keep the reduced dimension (default True).

name
<xref:str>

the name of the Function instance in the network

Returns

Function

Examples


>>> # create 3x2x2 matrix in a sequence of length 1 in a batch of one sample
>>> data = np.array([[[5,1], [20,2]],[[30,1], [40,2]],[[55,1], [60,2]]], dtype=np.float32)

>>> C.reduce_log_sum_exp(data, axis=0).eval().round(4)
array([[[ 55.      ,   2.0986],
        [ 60.      ,   3.0986]]], dtype=float32)
>>> np.log(np.sum(np.exp(data), axis=0)).round(4)
array([[ 55.      ,   2.0986],
       [ 60.      ,   3.0986]], dtype=float32)
>>> C.reduce_log_sum_exp(data, axis=(0,2)).eval().round(4)
array([[[ 55.],
        [ 60.]]], dtype=float32)
>>> np.log(np.sum(np.exp(data), axis=(0,2))).round(4)
array([ 55.,  60.], dtype=float32)

>>> x = C.input_variable(shape=(2,2))
>>> lse = C.reduce_log_sum_exp(x, axis=[C.axis.Axis.default_batch_axis(), 1])
>>> lse.eval({x:data}).round(4)
array([[ 55.],
       [ 60.]], dtype=float32)
>>> np.log(np.sum(np.exp(data), axis=(0,2))).round(4)
array([ 55.,  60.], dtype=float32)

reduce_max

Computes the max of the input tensor's elements across a specified axis or a list of specified axes.

Note that CNTK keeps the shape of the resulting tensors when reducing over multiple static axes.

reduce_max(x, axis=None, keepdims=True, name='')

Parameters

x

input tensor

axis
<xref:int> or Axis or <xref:a list> or <xref:tuple of int> or Axis

axis along which the reduction will be performed

keepdims
<xref:boolean>

Whether to keep the reduced dimension (default True).

name
<xref:str>

the name of the Function instance in the network

Returns

Function

Examples


>>> # create 3x2x2 matrix in a sequence of length 1 in a batch of one sample
>>> data = np.array([[[5,1], [20,2]],[[30,1], [40,2]],[[55,1], [60,2]]], dtype=np.float32)

>>> C.reduce_max(data, 0).eval().round(4)
array([[[ 55.,   1.],
        [ 60.,   2.]]], dtype=float32)
>>> C.reduce_max(data, 1).eval().round(4)
array([[[ 20.,   2.]],
<BLANKLINE>
       [[ 40.,   2.]],
<BLANKLINE>
       [[ 60.,   2.]]], dtype=float32)
>>> C.reduce_max(data, (0,2)).eval().round(4)
array([[[ 55.],
        [ 60.]]], dtype=float32)

>>> x = C.input_variable((2,2))
>>> C.reduce_max( x * 1.0, (C.Axis.default_batch_axis(), 1)).eval({x: data}).round(4)
array([[ 55.],
       [ 60.]], dtype=float32)

reduce_mean

Computes the mean of the input tensor's elements across a specified axis or a list of specified axes.

reduce_mean(x, axis=None, keepdims=True, name='')

Parameters

x

input tensor

axis
<xref:int> or Axis or <xref:a list> or <xref:tuple of int> or Axis

axis along which the reduction will be performed

keepdims
<xref:boolean>

Whether to keep the reduced dimension (default True).

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Note that CNTK keeps the shape of the resulting tensors when reducing over multiple static axes.

Examples


>>> # create 3x2x2 matrix in a sequence of length 1 in a batch of one sample
>>> data = np.array([[[5,1], [20,2]],[[30,1], [40,2]],[[55,1], [60,2]]], dtype=np.float32)

>>> C.reduce_mean(data, 0).eval().round(4)
array([[[ 30.,   1.],
        [ 40.,   2.]]], dtype=float32)
>>> np.mean(data, axis=0).round(4)
array([[ 30.,   1.],
       [ 40.,   2.]], dtype=float32)
>>> C.reduce_mean(data, 1).eval().round(4)
array([[[ 12.5,   1.5]],
<BLANKLINE>
       [[ 35. ,   1.5]],
<BLANKLINE>
       [[ 57.5,   1.5]]], dtype=float32)
>>> np.mean(data, axis=1).round(4)
array([[ 12.5,   1.5],
       [ 35. ,   1.5],
       [ 57.5,   1.5]], dtype=float32)
>>> C.reduce_mean(data, (0,2)).eval().round(4)
array([[[ 15.5],
        [ 21. ]]], dtype=float32)

>>> x = C.input_variable((2,2))
>>> C.reduce_mean( x * 1.0, (C.Axis.default_batch_axis(), 1)).eval({x: data}).round(4)
array([[ 15.5],
       [ 21.      ]], dtype=float32)

reduce_min

Computes the min of the input tensor's elements across a specified axis or a list of specified axes.

reduce_min(x, axis=None, keepdims=True, name='')

Parameters

x

input tensor

axis
<xref:int> or Axis or <xref:a list of integers> or <xref:a list of cntk.axis.Axis>

axis along which the reduction will be performed

keepdims
<xref:boolean>

Whether to keep the reduced dimension (default True).

name
<xref:str>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Note that CNTK keeps the shape of the resulting tensors when reducing over multiple static axes.

Examples


>>> # create 3x2x2 matrix in a sequence of length 1 in a batch of one sample
>>> data = np.array([[[5,1], [20,2]],[[30,1], [40,2]],[[55,1], [60,2]]], dtype=np.float32)

>>> C.reduce_min(data, 0).eval().round(4)
array([[[  5.,   1.],
        [ 20.,   2.]]], dtype=float32)
>>> C.reduce_min(data, 1).eval().round(4)
array([[[  5.,   1.]],
<BLANKLINE>
       [[ 30.,   1.]],
<BLANKLINE>
       [[ 55.,   1.]]], dtype=float32)
>>> C.reduce_min(data, (0,2)).eval().round(4)
array([[[ 1.],
        [ 2.]]], dtype=float32)

>>> x = C.input_variable((2,2))
>>> C.reduce_min( x * 1.0, (C.Axis.default_batch_axis(), 1)).eval({x: data}).round(4)
array([[ 1.],
       [ 2.]], dtype=float32)

reduce_prod

Computes the product of the input tensor's elements across the specified axis.

Note that CNTK keeps the shape of the resulting tensors when reducing over multiple static axes.

reduce_prod(x, axis=None, keepdims=True, name='')

Parameters

x

input tensor

axis
<xref:int> or Axis or <xref:a list> or <xref:tuple of int> or Axis

axis along which the reduction will be performed

keepdims
<xref:boolean>

Whether to keep the reduced dimension (default True).

name
<xref:str>

the name of the Function instance in the network

Returns

Function

Examples


>>> # create 3x2x2 matrix in a sequence of length 1 in a batch of one sample
>>> data = np.array([[[5,1], [20,2]],[[30,1], [40,2]],[[55,1], [60,2]]], dtype=np.float32)

>>> C.reduce_prod(data, 0).eval().round(4)
array([[[  8250.,      1.],
        [ 48000.,      8.]]], dtype=float32)
>>> C.reduce_prod(data, 1).eval().round(4)
array([[[  100.,     2.]],
<BLANKLINE>
       [[ 1200.,     2.]],
<BLANKLINE>
       [[ 3300.,     2.]]], dtype=float32)
>>> C.reduce_prod(data, (0,2)).eval().round(4)
array([[[   8250.],
        [ 384000.]]], dtype=float32)

>>> x = C.input_variable((2,2))
>>> C.reduce_prod( x * 1.0, (C.Axis.default_batch_axis(), 1)).eval({x: data}).round(4)
array([[   8250.],
       [ 384000.]], dtype=float32)

reduce_sum

Computes the sum of the input tensor's elements across one axis or a list of axes. If the axis parameter is not specified, the sum will be computed over all static axes, which is equivalent to specifying axis=Axis.all_static_axes(). If axis=Axis.all_axes() is specified, then the output is a scalar which is the sum of all the elements in the minibatch. And if axis=Axis.default_batch_axis() is specified, then the reduction will happen across the batch axis (in this case the input must not be a sequence).

Note that CNTK keeps the shape of the resulting tensors when reducing over multiple static axes.

reduce_sum(x, axis=None, keepdims=True, name='')

Parameters

x

input tensor

axis
<xref:int> or Axis or <xref:a list> or <xref:tuple of int> or Axis

axis along which the reduction will be performed

keepdims
<xref:boolean>

Whether to keep the reduced dimension (default True).

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

Function

Examples


>>> # create 3x2x2 matrix in a sequence of length 1 in a batch of one sample
>>> data = np.array([[[5,1], [20,2]],[[30,1], [40,2]],[[55,1], [60,2]]], dtype=np.float32)

>>> C.reduce_sum(data, 0).eval().round(4)
array([[[  90.,    3.],
        [ 120.,    6.]]], dtype=float32)
>>> np.sum(data, axis=0).round(4)
array([[  90.,    3.],
       [ 120.,    6.]], dtype=float32)
>>> C.reduce_sum(data, 1).eval().round(4)
array([[[  25.,    3.]],
<BLANKLINE>
       [[  70.,    3.]],
<BLANKLINE>
       [[ 115.,    3.]]], dtype=float32)
>>> np.sum(data, axis=1).round(4)
array([[  25.,    3.],
       [  70.,    3.],
       [ 115.,    3.]], dtype=float32)
>>> C.reduce_sum(data, (0,2)).eval().round(4)
array([[[  93.],
        [ 126.]]], dtype=float32)
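
Reducing over all axes (including the batch axis), as described above, produces a scalar; a minimal sketch using the same data, not part of the original reference (the sum of every element is 219):

>>> x = C.input_variable((2,2))
>>> C.reduce_sum(x * 1.0, C.Axis.all_axes()).eval({x: data})   # expected: a scalar, 219.0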

reduce_sum_square

Computes the sum of squares of the input tensor's elements along the provided axes. The resulting tensor has the same rank as the input if keepdims is True. If keepdims is False, the reduced dimensions are pruned from the resulting tensor.

reduce_sum_square(x, axis=None, keepdims=True, name='')

Parameters

x

input tensor

axis
<xref:int> or Axis or <xref:a list> or <xref:tuple of int> or Axis

axis along which the reduction will be performed

keepdims
<xref:boolean>

Whether to keep the reduced dimension (default True).

name
<xref:str>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> C.reduce_sum_square([[[1,2], [3,4]]], 2).eval()
array([[[  5.],
        [ 25.]]], dtype=float32)

relu

Rectified linear operation. Computes the element-wise rectified linear of x: max(x, 0)

The output tensor has the same shape as x.

relu(x, name='')

Parameters

x

numpy array or any Function that outputs a tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> C.relu([[-1, -0.5, 0, 1, 2]]).eval()
array([[ 0.,  0.,  0.,  1.,  2.]], dtype=float32)

reshape

Reinterpret input samples as having different tensor dimensions. One dimension may be specified as 0 and will be inferred.

The output tensor has the shape specified by 'shape'.

reshape(x, shape, begin_axis=None, end_axis=None, name='')

Parameters

x

tensor to be reshaped

shape
<xref:tuple>

a tuple defining the resulting shape. The specified shape tuple may contain -1 for at most one axis, which is automatically inferred to the correct dimension size by dividing the total size of the sub-shape being reshaped with the product of the dimensions of all the non-inferred axes of the replacement shape.

begin_axis
<xref:int> or <xref:None>

shape replacement begins at this axis. Negative values are counting from the end. None is the same as 0. To refer to the end of the shape tuple, pass Axis.new_leading_axis().

end_axis
<xref:int> or <xref:None>

shape replacement ends at this axis (excluding this axis). Negative values are counting from the end. None refers to the end of the shape tuple.

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> i1 = C.input_variable(shape=(3,2))
>>> C.reshape(i1, (2,3)).eval({i1:np.asarray([[[[0., 1.],[2., 3.],[4., 5.]]]], dtype=np.float32)})
array([[[ 0.,  1.,  2.],
        [ 3.,  4.,  5.]]], dtype=float32)
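
The begin_axis/end_axis parameters can be sketched as follows (illustrative, not part of the original reference): replacing only the trailing axes of a (2, 3, 4) input with a single axis of 12 leaves the leading axis untouched:

>>> i2 = C.input_variable((2, 3, 4))
>>> C.reshape(i2, (12,), begin_axis=1).shape
(2, 12)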

roipooling

The ROI (Region of Interest) pooling operation pools over sub-regions of an input volume and produces a fixed sized output volume regardless of the ROI size. It is used for example for object detection.

Each input image has a fixed number of regions of interest, which are specified as bounding boxes (x, y, w, h) that are relative to the image size [W x H]. This operation can be used as a replacement for the final pooling layer of an image classification network (as presented in Fast R-CNN and others).

Changed in version 2.1: The signature was updated to match the Caffe implementation: the parameters pooling_type and spatial_scale were added, and the coordinates for the parameters rois are now absolute to the original image size.

roipooling(operand, rois, pooling_type, roi_output_shape, spatial_scale, name='')

Parameters

operand

a convolutional feature map as the input volume ([W x H x C x N]).

pooling_type

only <xref:cntk.ops.MAX_POOLING>

rois

the coordinates of the ROIs per image ([4 x roisPerImage x N]), each ROI is (x1, y1, x2, y2) absolute to original image size.

roi_output_shape

dimensions (width x height) of the ROI pooling output shape

spatial_scale

the scale of operand from the original image size.

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

round

The output of this operation is the element-wise value rounded to the nearest integer. In case of a tie, where an element has an exact fractional part of 0.5, this operation follows the "round half-up" tie-breaking strategy. This is different from the round operation of numpy, which follows "round half to even".

round(arg, name='')

Parameters

arg

input tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network (optional)

Returns

cntk.ops.functions.Function

Examples


>>> C.round([0.2, 1.3, 4., 5.5, 0.0]).eval()
array([ 0.,  1.,  4.,  6.,  0.], dtype=float32)

>>> C.round([[0.6, 3.3], [1.9, 5.6]]).eval()
array([[ 1.,  3.],
       [ 2.,  6.]], dtype=float32)

>>> C.round([-5.5, -4.2, -3., -0.7, 0]).eval()
array([-5., -4., -3., -1.,  0.], dtype=float32)

>>> C.round([[-0.6, -4.3], [1.9, -3.2]]).eval()
array([[-1., -4.],
       [ 2., -3.]], dtype=float32)

selu

Scaled exponential linear unit operation. Computes the element-wise scaled exponential linear of x: scale * x for x >= 0 and scale * alpha * (exp(x) - 1) otherwise.

The output tensor has the same shape as x.

selu(x, scale=1.0507009873554805, alpha=1.6732632423543772, name='')

Parameters

x
<xref:numpy.array> or Function

any Function that outputs a tensor.

name
<xref:str>, <xref:default to ''>

the name of the Function instance in the network

Returns

An instance of Function

Return type

Examples


>>> C.selu([[-1, -0.5, 0, 1, 2]]).eval()
array([[-1.111331, -0.691758,  0.      ,  1.050701,  2.101402]], dtype=float32)

sigmoid

Computes the element-wise sigmoid of x: 1 / (1 + exp(-x))

The output tensor has the same shape as x.

sigmoid(x, name='')

Parameters

x

numpy array or any Function that outputs a tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> C.sigmoid([-2, -1., 0., 1., 2.]).eval()
array([ 0.119203,  0.268941,  0.5     ,  0.731059,  0.880797], dtype=float32)

sin

Computes the element-wise sine of x:

The output tensor has the same shape as x.

sin(x, name='')

Parameters

x

numpy array or any Function that outputs a tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> np.round(C.sin(np.arcsin([[1,0.5],[-0.25,-0.75]])).eval(),5)
array([[ 1.  ,  0.5 ],
       [-0.25, -0.75]], dtype=float32)

sinh

Computes the element-wise sinh of x:

The output tensor has the same shape as x.

sinh(x, name='')

Parameters

x

numpy array or any Function that outputs a tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> np.round(C.sinh([[1,0.5],[-0.25,-0.75]]).eval(),5)
array([[ 1.1752 ,  0.5211 ],
       [-0.25261, -0.82232]], dtype=float32)

slice

Slice the input along one or multiple axes.

slice(x, axis, begin_index, end_index, strides=None, name='')

Returns

cntk.ops.functions.Function

Examples


>>> # slice using input variable
>>> # create 2x3 matrix
>>> x1 = C.input_variable((2,3))
>>> # slice index 1 (second) at first axis
>>> C.slice(x1, 0, 1, 2).eval({x1: np.asarray([[[1,2,-3],
...                                             [4, 5, 6]]],dtype=np.float32)})
array([[[ 4.,  5.,  6.]]], dtype=float32)
<BLANKLINE>
>>> # slice index 0 (first) at second axis
>>> C.slice(x1, 1, 0, 1).eval({x1: np.asarray([[[1,2,-3],
...                                             [4, 5, 6]]],dtype=np.float32)})
array([[[ 1.],
        [ 4.]]], dtype=float32)
>>> # slice with strides
>>> C.slice(x1, 0, 0, 2, 2).eval({x1: np.asarray([[[1,2,-3],
...                                                [4, 5, 6]]],dtype=np.float32)})
array([[[ 1.,  2., -3.]]], dtype=float32)
<BLANKLINE>
>>> # reverse
>>> C.slice(x1, 0, 0, 2, -1).eval({x1: np.asarray([[[1,2,-3],
...                                                 [4, 5, 6]]],dtype=np.float32)})
array([[[ 4.,  5.,  6.],
        [ 1.,  2., -3.]]], dtype=float32)
<BLANKLINE>
>>> # slice along multiple axes
>>> C.slice(x1, [0,1], [1,0], [2,1]).eval({x1: np.asarray([[[1, 2, -3],
...                                                         [4, 5, 6]]],dtype=np.float32)})
array([[[ 4.]]], dtype=float32)
<BLANKLINE>
>>> # slice using constant
>>> data = np.asarray([[1, 2, -3],
...                    [4, 5,  6]], dtype=np.float32)
>>> x = C.constant(value=data)
>>> C.slice(x, 0, 1, 2).eval()
array([[ 4.,  5.,  6.]], dtype=float32)
>>> C.slice(x, 1, 0, 1).eval()
array([[ 1.],
       [ 4.]], dtype=float32)
>>> C.slice(x, [0,1], [1,0], [2,1]).eval()
array([[ 4.]], dtype=float32)
<BLANKLINE>
>>> # slice using the index overload
>>> data = np.asarray([[1, 2, -3],
...                    [4, 5,  6]], dtype=np.float32)
>>> x = C.constant(value=data)
>>> x[0].eval()
array([[ 1.,  2.,  -3.]], dtype=float32)
>>> x[0, [1,2]].eval()
array([[ 2.,  -3.]], dtype=float32)
<BLANKLINE>
>>> x[1].eval()
array([[ 4.,  5.,  6.]], dtype=float32)
>>> x[:,:2,:].eval()
array([[ 1.,  2.],
       [ 4.,  5.]], dtype=float32)
See also

Indexing in NumPy: https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html

softmax

Computes the softmax of x, i.e. the gradient of log(reduce_sum(exp(z))) at z = x. Concretely,

softmax(x) = exp(x) / reduce_sum(exp(x))

with the understanding that the implementation can use equivalent formulas for efficiency and numerical stability.

The output is a vector of non-negative numbers that sum to 1 and can therefore be interpreted as probabilities for mutually exclusive outcomes as in the case of multiclass classification.

If axis is given as an integer, then the softmax will be computed along that axis. If the provided axis is -1, it will be computed along the last axis. Otherwise, softmax will be applied to all axes.

softmax(x, axis=None, name='')

Parameters

x

numpy array or any Function that outputs a tensor

axis
<xref:int> or Axis

axis along which the softmax operation will be performed

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> C.softmax([[1, 1, 2, 3]]).eval()
array([[ 0.082595,  0.082595,  0.224515,  0.610296]], dtype=float32)

>>> C.softmax([1, 1]).eval()
array([ 0.5,  0.5], dtype=float32)

>>> C.softmax([[[1, 1], [3, 5]]], axis=-1).eval()
array([[[ 0.5     ,  0.5     ],
        [ 0.119203,  0.880797]]], dtype=float32)

>>> C.softmax([[[1, 1], [3, 5]]], axis=1).eval()
array([[[ 0.119203,  0.017986],
        [ 0.880797,  0.982014]]], dtype=float32)

softplus

Softplus operation. Computes the element-wise softplus of x: softplus(x) = log(1 + exp(x)).

The optional steepness parameter allows making the knee sharper (steepness > 1) or softer, by computing softplus(x * steepness) / steepness. (For very large steepness, this approaches a linear rectifier.)

The output tensor has the same shape as x.

softplus(x, steepness=1, name='')

Parameters

x
<xref:numpy.array> or Function

any Function that outputs a tensor.

steepness
<xref:float>, <xref:optional>

optional steepness factor

name
<xref:str>, <xref:default to ''>

the name of the Function instance in the network

Returns

An instance of Function

Return type

Examples


>>> C.softplus([[-1, -0.5, 0, 1, 2]]).eval()
array([[ 0.313262,  0.474077,  0.693147,  1.313262,  2.126928]], dtype=float32)

>>> C.softplus([[-1, -0.5, 0, 1, 2]], steepness=4).eval()
array([[ 0.004537,  0.031732,  0.173287,  1.004537,  2.000084]], dtype=float32)

softsign

Computes the element-wise softsign of x: x / (1 + |x|)

The output tensor has the same shape as x.

softsign(x, steepness=1, name='')

Parameters

x

numpy array or any Function that outputs a tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> C.softsign([[-1, 0, 1]]).eval()
array([[-0.5,  0. ,  0.5]], dtype=float32)

space_to_depth

Rearranges elements in the input tensor from the spatial dimensions to the depth dimension.

This is the reverse transformation of depth_to_space. This operation is useful for implementing and testing sub-pixel convolution that is part of models for image super-resolution (see [1]). It rearranges elements of an input tensor of shape (C, H, W) to a tensor of shape (C*b*b, H/b, W/b), where b is the block_size, by rearranging non-overlapping spatial blocks of size block_size x block_size into the depth/channel dimension at each location.

space_to_depth(operand, block_size, name='')

Parameters

operand

Input tensor, with dimensions (C, H, W).

block_size
<xref:int>

Integer value. This defines the size of the spatial block whose elements are moved to the depth dimension. The size of the spatial dimensions (H, W) in the input tensor must be divisible by block_size.

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

Function

Examples


>>> np.random.seed(3)
>>> x = np.random.randint(low=0, high=100, size=(1, 4, 6)).astype(np.float32)
>>> a = C.input_variable((1, 4, 6))
>>> s2d_op = C.space_to_depth(a, block_size=2)
>>> s2d_op.eval({a:x})
array([[[[ 24.,  56.,   0.],
         [ 96.,  44.,  39.]],
<BLANKLINE>
        [[  3.,  72.,  21.],
         [ 20.,  93.,  14.]],
<BLANKLINE>
        [[ 19.,  41.,  21.],
         [ 26.,  90.,  66.]],
<BLANKLINE>
        [[ 74.,  10.,  38.],
         [ 81.,  22.,   2.]]]], dtype=float32)
See also

[1] W. Shi et al.: Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network.

splice

Concatenate the input tensors along an axis.

splice(*inputs, **kw_axis_name)

Parameters

inputs

one or more input tensors

axis
<xref:int> or Axis, <xref:optional>, <xref:keyword only>

axis along which the concatenation will be performed

name
<xref:str>, <xref:optional>, <xref:keyword only>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> # create 2x2 matrix in a sequence of length 1 in a batch of one sample
>>> data1 = np.asarray([[[1, 2],
...                      [4, 5]]], dtype=np.float32)

>>> x = C.constant(value=data1)
>>> # create 3x2 matrix in a sequence of length 1 in a batch of one sample
>>> data2 = np.asarray([[[10, 20],
...                       [30, 40],
...                       [50, 60]]],dtype=np.float32)
>>> y = C.constant(value=data2)
>>> # splicing the two matrices along their first axis (axis=1 of the constants) returns a 5x2 matrix
>>> C.splice(x, y, axis=1).eval()
array([[[  1.,   2.],
        [  4.,   5.],
        [ 10.,  20.],
        [ 30.,  40.],
        [ 50.,  60.]]], dtype=float32)

sqrt

Computes the element-wise square-root of x:

Note

CNTK returns zero for sqrt of negative numbers; this will be changed to return NaN.

sqrt(x, name='')

Parameters

x

numpy array or any Function that outputs a tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

Function

Examples


>>> C.sqrt([0., 4.]).eval()
array([ 0.,  2.], dtype=float32)
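
As the note above states, negative inputs currently map to zero (illustrative sketch, not part of the original reference):

>>> C.sqrt([-4., 4.]).eval()
array([ 0.,  2.], dtype=float32)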

square

Computes the element-wise square of x:

square(x, name='')

Parameters

x

numpy array or any Function that outputs a tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> C.square([1., 10.]).eval()
array([   1.,  100.], dtype=float32)

squeeze

Removes axes whose size is 1. If axes is specified and any of them does not have size 1, an exception will be raised.

squeeze(x, axes=None, name='')

Parameters

x

input tensor

axes

The axes to squeeze out (default: all static axes).

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> x0 = np.arange(12).reshape((2, 2, 1, 3)).astype('f')
>>> x = C.input_variable((2, 1, 3))
>>> C.squeeze(x).eval({x: x0})
array([[[  0.,   1.,   2.],
        [  3.,   4.,   5.]],
<BLANKLINE>
       [[  6.,   7.,   8.],
        [  9.,  10.,  11.]]], dtype=float32)

stop_gradient

Outputs its input as it is and prevents any gradient contribution from its output to its input.

stop_gradient(input, name='')

Parameters

input

the cntk.ops.functions.Function that outputs a tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function
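
Examples

A minimal sketch (not part of the original reference): the factor wrapped in stop_gradient contributes to the forward value but not to the gradient, so the gradient of reduce_sum(stop_gradient(w) * w) with respect to w is simply the current value of w:

>>> w = C.parameter(init=np.array([1., 2., 3.], dtype=np.float32))
>>> f = C.reduce_sum(C.stop_gradient(w) * w)
>>> f.grad({}, wrt=[w])   # expected: array([ 1.,  2.,  3.], dtype=float32)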

straight_through_impl

Element-wise binarization node using the straight-through estimator.

straight_through_impl(x, name='')

Parameters

inputs

one input tensor

name
<xref:str>, <xref:optional>, <xref:keyword only>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> # create (1,3) matrix
>>> data = np.asarray([[-3, 4, -2]], dtype=np.float32)
>>> x = C.input_variable((3))
>>> C.straight_through_impl(x).eval({x:data})
array([[-1., 1., -1.]], dtype=float32)

sum

Create a new Function instance that computes element-wise sum of input tensors.

sum(*operands, **kw_name)

Parameters

operands
<xref:list>

list of functions

name
<xref:str>, <xref:optional>

the name of the sum Function in the network

Returns

cntk.ops.functions.Function

Examples


>>> in1_data = np.asarray([[1., 2., 3., 4.]], np.float32)
>>> in2_data = np.asarray([[0., 5., -3., 2.]], np.float32)
>>> in1 = C.input_variable(np.shape(in1_data))
>>> in2 = C.input_variable(np.shape(in2_data))
>>> C.sum([in1, in2]).eval({in1: in1_data, in2: in2_data})
array([[[ 1.,  7.,  0.,  6.]]], dtype=float32)

swapaxes

Swaps two axes of the tensor. The output tensor has the same data but with axis1 and axis2 swapped.

swapaxes(x, axis1=0, axis2=1, name='')

Parameters

x

tensor to be transposed

axis1
<xref:int> or Axis

the axis to swap with axis2

axis2
<xref:int> or Axis

the axis to swap with axis1

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> C.swapaxes([[[0,1],[2,3],[4,5]]], 1, 2).eval()
array([[[ 0.,  2.,  4.],
        [ 1.,  3.,  5.]]], dtype=float32)

tan

Computes the element-wise tangent of x:

The output tensor has the same shape as x.

tan(x, name='')

Parameters

x

numpy array or any Function that outputs a tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> np.round(C.tan([-1, 0, 1]).eval(), 5)
array([-1.55741,  0.     ,  1.55741], dtype=float32)

tanh

Computes the element-wise tanh of x:

The output tensor has the same shape as x.

tanh(x, name='')

Parameters

x

numpy array or any Function that outputs a tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> C.tanh([[1,2],[3,4]]).eval()
array([[ 0.761594,  0.964028],
       [ 0.995055,  0.999329]], dtype=float32)

times

The output of this operation is the matrix product of the two input matrices. It supports broadcasting. Sparse is supported in the left operand, if it is a matrix. The operator '@' has been overloaded such that in Python 3.5 and later X @ W equals times(X, W).

For better performance when a times operation on a sequence is followed by sequence.reduce_sum, use infer_input_rank_to_map=TIMES_REDUCE_SEQUENCE_AXIS_WITHOUT_INFERRED_INPUT_RANK, i.e. replace the following:


   sequence.reduce_sum(times(seq1, seq2))

with:


   times(seq1, seq2, infer_input_rank_to_map=TIMES_REDUCE_SEQUENCE_AXIS_WITHOUT_INFERRED_INPUT_RANK)

times(left, right, output_rank=1, infer_input_rank_to_map=-1, name='')

Parameters

left

left side matrix or tensor

right

right side matrix or tensor

output_rank
<xref:int>

in case we have tensors as arguments, output_rank represents the number of axes to be collapsed in order to transform the tensors into matrices, perform the operation and then reshape back (explode the axes)

infer_input_rank_to_map
<xref:int>

meant for internal use only. Always use default value

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> C.times([[1,2],[3,4]], [[5],[6]]).eval()
array([[ 17.],
       [ 39.]], dtype=float32)

>>> C.times(1.*np.reshape(np.arange(8), (2,2,2)),1.*np.reshape(np.arange(8), (2,2,2)), output_rank=1).eval()
array([[ 28.,  34.],
       [ 76.,  98.]])

>>> C.times(1.*np.reshape(np.arange(8), (2,2,2)),1.*np.reshape(np.arange(8), (2,2,2)), output_rank=2).eval()
array([[[[  4.,   5.],
         [  6.,   7.]],
<BLANKLINE>
        [[ 12.,  17.],
         [ 22.,  27.]]],
<BLANKLINE>
<BLANKLINE>
       [[[ 20.,  29.],
         [ 38.,  47.]],
<BLANKLINE>
        [[ 28.,  41.],
         [ 54.,  67.]]]])

times_transpose

The output of this operation is the product of the first (left) argument with the second (right) argument transposed. The second (right) argument must have a rank of 1 or 2. This operation is conceptually computing np.dot(left, right.T) except when right is a vector in which case the output is np.dot(left,np.reshape(right,(1,-1)).T) (matching numpy when left is a vector).

times_transpose(left, right, name='')

Parameters

left

left side tensor

right

right side matrix or vector

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> a=np.array([[1,2],[3,4]],dtype=np.float32)
>>> b=np.array([2,-1],dtype=np.float32)
>>> c=np.array([[2,-1]],dtype=np.float32)
>>> d=np.reshape(np.arange(24,dtype=np.float32),(4,3,2))
>>> print(C.times_transpose(a, a).eval())
[[  5.  11.]
 [ 11.  25.]]
>>> print(C.times_transpose(a, b).eval())
[[ 0.]
 [ 2.]]
>>> print(C.times_transpose(a, c).eval())
[[ 0.]
 [ 2.]]
>>> print(C.times_transpose(b, a).eval())
[ 0.  2.]
>>> print(C.times_transpose(b, b).eval())
[ 5.]
>>> print(C.times_transpose(b, c).eval())
[ 5.]
>>> print(C.times_transpose(c, a).eval())
[[ 0.  2.]]
>>> print(C.times_transpose(c, b).eval())
[[ 5.]]
>>> print(C.times_transpose(c, c).eval())
[[ 5.]]
>>> print(C.times_transpose(d, a).eval())
[[[   2.    4.]
  [   8.   18.]
  [  14.   32.]]
<BLANKLINE>
 [[  20.   46.]
  [  26.   60.]
  [  32.   74.]]
<BLANKLINE>
 [[  38.   88.]
  [  44.  102.]
  [  50.  116.]]
<BLANKLINE>
 [[  56.  130.]
  [  62.  144.]
  [  68.  158.]]]
>>> print(C.times_transpose(d, b).eval())
[[[ -1.]
  [  1.]
  [  3.]]
<BLANKLINE>
 [[  5.]
  [  7.]
  [  9.]]
<BLANKLINE>
 [[ 11.]
  [ 13.]
  [ 15.]]
<BLANKLINE>
 [[ 17.]
  [ 19.]
  [ 21.]]]
>>> print(C.times_transpose(d, c).eval())
[[[ -1.]
  [  1.]
  [  3.]]
<BLANKLINE>
 [[  5.]
  [  7.]
  [  9.]]
<BLANKLINE>
 [[ 11.]
  [ 13.]
  [ 15.]]
<BLANKLINE>
 [[ 17.]
  [ 19.]
  [ 21.]]]

to_batch

Concatenate the input tensor's first axis to batch axis.

to_batch(x, name='')

Parameters

x

a tensor with dynamic axis

name
<xref:str>, <xref:optional>, <xref:keyword only>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> data = np.arange(12).reshape((3,2,2))
>>> x = C.constant(value=data)
>>> y = C.to_batch(x)
>>> y.shape
(2, 2)

to_sequence

This function converts 'x' to a sequence using the most significant static axis [0] as the sequence axis.

The sequence_lengths input is optional; if unspecified, all sequences are assumed to be of the same length, i.e. the dimensionality of the most significant static axis.

to_sequence(x, sequence_lengths=None, sequence_axis_name_prefix='toSequence_', name='')

Parameters

x

the tensor (or its name) which is converted to a sequence

sequence_lengths

Optional tensor operand representing the sequence lengths. if unspecified, all sequences are assumed to be of the same length; i.e. dimensionality of the most significant static axis.

sequence_axis_name_prefix
<xref:str>, <xref:optional>

prefix of the new sequence axis name.

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

Function
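
Examples

A rough sketch (not part of the original reference), assuming the input variable carries only the batch dynamic axis: the leading static axis of length 2 becomes the sequence axis, so each sequence element has shape (3,):

>>> x = C.input_variable((2, 3))
>>> C.to_sequence(x).shape
(3,)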

to_sequence_like

This function converts 'x' to a sequence using the most significant static axis [0] as the sequence axis. The length of the sequences are obtained from the 'dynamic_axes_like' operand.

to_sequence_like(x, dynamic_axes_like, name='')

Parameters

x

the tensor (or its name) which is converted to a sequence

dynamic_axes_like

Tensor operand used to obtain the lengths of the generated sequences. The dynamic axes of the generated sequence tensor match the dynamic axes of the 'dynamic_axes_like' operand.

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

Function

top_k

Computes the k largest values of the input tensor and the corresponding indices along the specified axis (default the last axis). The returned Function has two outputs. The first one contains the top k values in sorted order, and the second one contains the corresponding top k indices.

top_k(x, k, axis=-1, name='')

Parameters

x

numpy array or any Function that outputs a tensor

k
<xref:int>

number of top items to return

axis

axis along which to perform the operation (default: -1)

name
<xref:str>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> x = C.input_variable(10)
>>> y = C.top_k(-x * C.log(x), 3)
>>> x0 = np.arange(10,dtype=np.float32)*0.1
>>> top = y.eval({x:x0})
>>> top_values = top[y.outputs[0]]
>>> top_indices = top[y.outputs[1]]
>>> top_indices
array([[ 4.,  3.,  5.]], dtype=float32)

transpose

Permutes the axes of the tensor. The output has the same data but the axes are permuted according to perm.

transpose(x, perm, name='')

Parameters

x

tensor to be transposed

perm
<xref:list>

the permutation to apply to the axes.

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> a = np.arange(24).reshape(2,3,4).astype('f')
>>> np.array_equal(C.transpose(a, perm=(2, 0, 1)).eval(), np.transpose(a, (2, 0, 1)))
True

unpack_batch

Concatenate the input tensor's last dynamic axis to static axis. Only tensors with batch axis are supported now.

unpack_batch(x, name='')

Parameters

x

a tensor with dynamic axis

name
<xref:str>, <xref:optional>, <xref:keyword only>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> data = np.arange(12).reshape((3,2,2))
>>> x = C.input_variable((2,2))
>>> C.unpack_batch(x).eval({x:data})
array([[[  0.,   1.],
        [  2.,   3.]],
<BLANKLINE>
       [[  4.,   5.],
        [  6.,   7.]],
<BLANKLINE>
       [[  8.,   9.],
        [ 10.,  11.]]], dtype=float32)

unpooling

Unpools the operand using information from pooling_input. Unpooling mirrors the operations performed by pooling and depends on the values provided to the corresponding pooling operation. The output should have the same shape as pooling_input. Pooling the result of an unpooling operation should give back the original input.

unpooling(operand, pooling_input, unpooling_type, unpooling_window_shape, strides=(1,), auto_padding=[False], name='')

Parameters

operand

unpooling input

pooling_input

input to the corresponding pooling operation

unpooling_type

only <xref:cntk.ops.MAX_UNPOOLING> is supported now

unpooling_window_shape

dimensions of the unpooling window

strides
<xref:default 1>

strides.

auto_padding

automatic padding flags for each input dimension.

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> img = np.reshape(np.arange(16, dtype = np.float32), [1, 4, 4])
>>> x = C.input_variable(img.shape)
>>> y = C.pooling(x, C.MAX_POOLING, (2,2), (2,2))
>>> C.unpooling(y, x, C.MAX_UNPOOLING, (2,2), (2,2)).eval({x : [img]})
array([[[[  0.,   0.,   0.,   0.],
          [  0.,   5.,   0.,   7.],
          [  0.,   0.,   0.,   0.],
          [  0.,  13.,   0.,  15.]]]], dtype=float32)

zeros_like

Creates an all-zeros tensor with the same shape and dynamic axes as x:

zeros_like(x, name='')

Parameters

x

numpy array or any Function that outputs a tensor

name
<xref:str>, <xref:optional>

the name of the Function instance in the network

Returns

cntk.ops.functions.Function

Examples


>>> x0 = np.arange(24).reshape((2, 3, 4)).astype('f')
>>> x = C.input_variable((3, 4))
>>> C.zeros_like(x).eval({x: x0})
array([[[ 0.,  0.,  0.,  0.],
        [ 0.,  0.,  0.,  0.],
        [ 0.,  0.,  0.,  0.]],
<BLANKLINE>
       [[ 0.,  0.,  0.,  0.],
        [ 0.,  0.,  0.,  0.],
        [ 0.,  0.,  0.,  0.]]], dtype=float32)