# Apache MXNet - Python API Symbol

In this chapter, we will learn about the Symbol interface of MXNet.

## mxnet.symbol

Apache MXNet’s Symbol API is an interface for symbolic programming. Symbol API features the use of the following −

• Computational graphs

• Reduced memory usage

• Pre-use function optimization

The example given below shows how one can create a simple expression by using MXNet’s Symbol API −

import mxnet as mx
# Two placeholders, namely x and y, are created with mx.sym.Variable
x = mx.sym.Variable('x')
y = mx.sym.Variable('y')
# The symbol here is constructed using the plus ‘+’ operator.
z = x + y


Output

Displaying z in an interactive session shows the following output −

<Symbol _plus0>


Example

(x, y, z)


Output

The output is given below −

(<Symbol x>, <Symbol y>, <Symbol _plus0>)
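
The idea behind a symbolic expression (build the graph first, compute later) can be sketched in plain Python. This is a toy illustration only: the class `Sym` and its methods are hypothetical names invented here and do not reflect how MXNet implements `Symbol`.

```python
# Toy illustration only: Sym is a hypothetical class, not MXNet's Symbol.
class Sym:
    """A node in a tiny expression graph."""

    def __init__(self, name, op=None, inputs=()):
        self.name = name
        self.op = op          # None for placeholders, a callable for operators
        self.inputs = inputs  # child nodes of an operator node

    def __add__(self, other):
        # Building the graph: no arithmetic happens at this point.
        return Sym("_plus", op=lambda a, b: a + b, inputs=(self, other))

    def eval(self, **bindings):
        # Placeholders look up their bound value; operator nodes recurse.
        if self.op is None:
            return bindings[self.name]
        return self.op(*(node.eval(**bindings) for node in self.inputs))

    def __repr__(self):
        return f"<Sym {self.name}>"


x = Sym("x")
y = Sym("y")
z = x + y                     # only records the operation
result = z.eval(x=2, y=3)     # values are supplied at evaluation time
print(z, result)              # <Sym _plus> 5
```

Separating graph construction from evaluation is what gives a framework the chance to optimize the whole expression before any data flows through it.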


Now let us discuss in detail the classes, functions, and parameters of the Symbol API of MXNet.

### Classes

The following table lists the classes of the Symbol API of MXNet −

| Class | Definition |
|---|---|
| Symbol(handle) | This class, namely Symbol, is the symbolic graph of Apache MXNet. |

### Functions and their parameters

Following are some of the important functions and their parameters covered by the mxnet.symbol API −

| Function and its Parameters | Definition |
|---|---|
| Activation([data, act_type, out, name]) | It applies an activation function element-wise to the input. It supports the relu, sigmoid, tanh, softrelu, and softsign activation functions. |
| BatchNorm([data, gamma, beta, moving_mean, …]) | It is used for batch normalization. This function normalizes a data batch by mean and variance, then applies a scale gamma and an offset beta. |
| BilinearSampler([data, grid, cudnn_off, …]) | This function applies bilinear sampling to the input feature map. It is the key component of “Spatial Transformer Networks”. If you are familiar with the remap function in OpenCV, the usage of this function is quite similar. The only difference is that it has a backward pass. |
| BlockGrad([data, out, name]) | As the name specifies, this function stops gradient computation. It stops the accumulated gradient of the inputs from flowing through this operator in the backward direction. |
| cast([data, dtype, out, name]) | This function will cast all elements of the input to a new type. |
| zeros(shape[, dtype]) | This function, as the name specifies, returns a new symbol of the given shape and type, filled with zeros. |
| ones(shape[, dtype]) | This function, as the name specifies, returns a new symbol of the given shape and type, filled with ones. |
| full(shape, val[, dtype]) | This function, as the name specifies, returns a new array of the given shape and type, filled with the given value val. |
| arange(start[, stop, step, repeat, …]) | It will return evenly spaced values within a given interval. The values are generated within the half-open interval [start, stop), which means that the interval includes start but excludes stop. |
| linspace(start, stop, num[, endpoint, name, …]) | It will return num evenly spaced numbers within a specified interval. Unlike arange(), whether stop is included depends on the endpoint parameter: the values cover the closed interval [start, stop] when endpoint is True, and the half-open interval [start, stop) when it is False. |
| histogram(a[, bins, range]) | As the name implies, this function will compute the histogram of the input data. |
| power(base, exp) | As the name implies, this function will return the element-wise result of base raised to the powers from exp. Both inputs, i.e. base and exp, can be either a Symbol or a scalar. Note that broadcasting is not allowed. Use broadcast_pow if you need broadcasting. |
| SoftmaxActivation([data, mode, name, attr, out]) | This function applies softmax activation to the input. It is intended for internal layers. It is deprecated; use softmax() instead. |
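
The difference between a half-open and a closed interval is easy to see in a small plain-Python sketch. The helper functions below are hypothetical stand-ins written for illustration; they assume NumPy-style semantics (a positive step for `arange`, and an `endpoint` flag that decides whether `linspace` includes `stop`) and are not MXNet code.

```python
# Illustrative helpers only; they mimic the interval rules, not MXNet itself.
def arange(start, stop, step=1):
    """Values in the half-open interval [start, stop); assumes step > 0."""
    values = []
    current = start
    while current < stop:
        values.append(current)
        current += step
    return values


def linspace(start, stop, num, endpoint=True):
    """num evenly spaced values; stop is included only if endpoint is True."""
    if num == 1:
        return [float(start)]
    divisor = num - 1 if endpoint else num
    step = (stop - start) / divisor
    return [start + i * step for i in range(num)]


print(arange(0, 5))                       # [0, 1, 2, 3, 4]
print(linspace(0, 4, 5))                  # [0.0, 1.0, 2.0, 3.0, 4.0]
print(linspace(0, 4, 4, endpoint=False))  # [0.0, 1.0, 2.0, 3.0]
```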

### Implementation Examples

In the example below, we will use the power() function, which returns the element-wise result of base raised to the powers from exp. When both arguments are scalars, the result is a plain number:

import mxnet as mx
mx.sym.power(3, 5)


Output

You will see the following output −

243


Example

x = mx.sym.Variable('x')
y = mx.sym.Variable('y')
z = mx.sym.power(x, 3)
z.eval(x=mx.nd.array([1,2]))[0].asnumpy()


Output

This produces the following output −

array([1., 8.], dtype=float32)


Example

z = mx.sym.power(4, y)
z.eval(y=mx.nd.array([2,3]))[0].asnumpy()


Output

When you execute the above code, you should see the following output −

array([16., 64.], dtype=float32)


Example

z = mx.sym.power(x, y)
z.eval(x=mx.nd.array([4,5]), y=mx.nd.array([2,3]))[0].asnumpy()


Output

The output is mentioned below −

array([ 16., 125.], dtype=float32)


In the example given below, we apply softmax to the input. Because SoftmaxActivation() is deprecated, the example uses softmax() instead, here through the NDArray function mx.nd.softmax() so that the resulting values can be printed directly.

input_data = mx.nd.array([[2., 0.9, -0.5, 4., 8.], [4., -.7, 9., 2., 0.9]])
soft_max_act = mx.nd.softmax(input_data)
print(soft_max_act.asnumpy())


Output

You will see the following output −

[[2.4258138e-03 8.0748333e-04 1.9912292e-04 1.7924475e-02 9.7864312e-01]
[6.6843745e-03 6.0796250e-05 9.9204916e-01 9.0463174e-04 3.0112563e-04]]
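
To see where these numbers come from, here is a minimal plain-Python sketch of the softmax formula applied row by row. It is written for illustration with the standard library only and is not how MXNet computes the result internally; the helper name `softmax` is chosen here for convenience.

```python
import math


def softmax(row):
    """Softmax of one row: exp(v) / sum(exp(v)) over the row."""
    shift = max(row)  # subtracting the maximum keeps exp() from overflowing
    exps = [math.exp(v - shift) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


input_data = [[2.0, 0.9, -0.5, 4.0, 8.0], [4.0, -0.7, 9.0, 2.0, 0.9]]
probabilities = [softmax(row) for row in input_data]
for row in probabilities:
    print([round(p, 6) for p in row])
```

Each row sums to 1, and the largest input in a row (8.0 and 9.0 here) receives almost all of the probability mass, which matches the output shown above up to float32 rounding.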


## symbol.contrib

The Contrib Symbol API is defined in the symbol.contrib package. It typically provides many useful experimental APIs for new features. This API works as a place where the community can try out new features and where feature contributors can get feedback.

### Functions and their parameters

Following are some of the important functions and their parameters covered by the mxnet.symbol.contrib API −

| Function and its Parameters | Definition |
|---|---|
| rand_zipfian(true_classes, num_sampled, …) | This function draws random samples from an approximately Zipfian distribution, which serves as its base distribution. It randomly samples num_sampled candidates, and the elements of sampled_candidates are drawn from that base distribution. |
| foreach(body, data, init_states) | As the name implies, this function runs a loop with user-defined computation over its input along dimension 0. It simulates a for loop, and body holds the computation for one iteration of the loop. |
| while_loop(cond, func, loop_vars[, …]) | As the name implies, this function runs a while loop with user-defined computation and loop condition. It simulates a while loop that iteratively performs the customized computation as long as the condition is satisfied. |
| cond(pred, then_func, else_func) | As the name implies, this function runs an if-then-else using a user-defined condition and computation. It simulates an if-like branch that chooses one of the two customized computations according to the specified condition. |
| getnnz([data, axis, out, name]) | This function gives us the number of stored values for a sparse tensor, including explicit zeros. It supports only CSR matrices on CPU. |
| requantize([data, min_range, max_range, …]) | This function requantizes the given data, which is quantized in int32 with the corresponding thresholds, into int8 using min and max thresholds that are either calculated at runtime or taken from calibration. |
| index_copy([old_tensor, index_vector, …]) | This function copies the elements of new_tensor into old_tensor by selecting the indices in the order given in index_vector. The output of this operator is a new tensor that contains the remaining elements of old_tensor and the copied elements of new_tensor. |
| interleaved_matmul_encdec_qk([queries, …]) | This operator computes the matrix multiplication between the projections of queries and keys in multi-head attention used as encoder-decoder. The inputs should be a tensor of projections of queries that follows the layout (seq_length, batch_size, num_heads * head_dim). |

### Implementation Examples

In the example below we will be using the function rand_zipfian for drawing random samples from an approximately Zipfian distribution −

import mxnet as mx
true_cls = mx.sym.Variable('true_cls')
samples, exp_count_true, exp_count_sample = mx.sym.contrib.rand_zipfian(true_cls, 5, 6)
samples.eval(true_cls=mx.nd.array([3]))[0].asnumpy()


Output

Because the samples are drawn randomly, your values may differ. You will see output similar to the following −

array([4, 0, 2, 1, 5], dtype=int64)


Example

exp_count_true.eval(true_cls=mx.nd.array([3]))[0].asnumpy()


Output

The output is mentioned below −

array([0.57336551])


Example

exp_count_sample.eval(true_cls=mx.nd.array([3]))[0].asnumpy()


Output

You will see the following output −

array([1.78103594, 0.46847373, 1.04183923, 0.57336551, 1.04183923])
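
The expected counts above can be reproduced by hand. The sketch below assumes the log-uniform approximation of the Zipfian distribution, P(class) = (log(class + 2) - log(class + 1)) / log(range_max + 1), and multiplies it by num_sampled to get the expected count. The helper names `zipfian_prob` and `expected_count` are hypothetical, introduced only for this illustration.

```python
import math


def zipfian_prob(cls, range_max):
    """Assumed probability of drawing class `cls` from [0, range_max)."""
    return (math.log(cls + 2) - math.log(cls + 1)) / math.log(range_max + 1)


def expected_count(cls, num_sampled, range_max):
    """Expected number of times `cls` appears among num_sampled draws."""
    return zipfian_prob(cls, range_max) * num_sampled


# Same settings as the example: true class 3, num_sampled=5, range_max=6.
print(expected_count(3, 5, 6))

# The probabilities of all classes in [0, range_max) add up to 1.
total = sum(zipfian_prob(c, 6) for c in range(6))
print(total)
```

Under this assumption, class 3 gives roughly 0.5734, in line with the value of exp_count_true shown earlier, and small class indices are noticeably more likely than large ones.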


In the example below we will be using the function while_loop for running a while loop for user-defined computation and loop condition −

cond = lambda i, s: i <= 7
func = lambda i, s: ([i + s], [i + 1, s + i])
loop_vars = (mx.sym.var('i'), mx.sym.var('s'))
outputs, states = mx.sym.contrib.while_loop(cond, func, loop_vars, max_iterations=10)
print(outputs)


Output

The output is given below:

[<Symbol _while_loop0>]


Example

print(states)


Output

This produces the following output −

[<Symbol _while_loop0>, <Symbol _while_loop0>]
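
The symbols printed above only describe the loop; nothing has been computed yet. To get a feel for what the loop would produce, the following plain-Python sketch simulates the same cond and func on ordinary integers. The helper `simulate_while_loop` is hypothetical, the starting values i = 0 and s = 1 are assumptions chosen for illustration, and details of the real operator (such as how step outputs are stacked or padded up to max_iterations) are deliberately left out.

```python
# Hypothetical helper: simulates the loop semantics on plain Python values.
def simulate_while_loop(cond, func, loop_vars, max_iterations):
    """Run func while cond holds, for at most max_iterations steps."""
    step_outputs = []
    steps = 0
    while steps < max_iterations and cond(*loop_vars):
        outputs, loop_vars = func(*loop_vars)
        step_outputs.append(outputs)
        steps += 1
    return step_outputs, list(loop_vars)


cond = lambda i, s: i <= 7
func = lambda i, s: ([i + s], [i + 1, s + i])

# Assumed starting values: i = 0, s = 1.
outputs, states = simulate_while_loop(cond, func, (0, 1), max_iterations=10)
print(outputs)  # [[1], [2], [4], [7], [11], [16], [22], [29]]
print(states)   # [8, 29]
```

The loop stops after eight steps because the condition i <= 7 fails once i reaches 8, well before the max_iterations limit of 10.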


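In the example below, we will use the function index_copy, which copies the elements of new_tensor into old_tensor. The example calls the NDArray version, mx.nd.contrib.index_copy, so that the result can be displayed directly.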

import mxnet as mx
a = mx.nd.zeros((6,3))
b = mx.nd.array([[1,2,3],[4,5,6],[7,8,9]])
index = mx.nd.array([0,4,2])
mx.nd.contrib.index_copy(a, index, b)


Output

When you execute the above code, you should see the following output −

[[1. 2. 3.]
[0. 0. 0.]
[7. 8. 9.]
[0. 0. 0.]
[4. 5. 6.]
[0. 0. 0.]]
<NDArray 6x3 @cpu(0)>
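
The row placement can be traced with a short plain-Python sketch that works on nested lists. The helper below is a hypothetical re-implementation for illustration, not the MXNet operator: row k of the new tensor is written to position index[k] of a copy of the old tensor.

```python
# Illustrative re-implementation on nested lists, not the MXNet operator.
def index_copy(old_tensor, index_vector, new_tensor):
    """Return a copy of old_tensor with the rows at index_vector replaced."""
    result = [list(row) for row in old_tensor]  # leave the input untouched
    for source_row, target in zip(new_tensor, index_vector):
        result[int(target)] = list(source_row)
    return result


a = [[0.0] * 3 for _ in range(6)]
b = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
index = [0, 4, 2]
copied = index_copy(a, index, b)
for row in copied:
    print(row)
```

Rows 0, 4, and 2 receive the first, second, and third rows of b, while the remaining rows keep their original zeros.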


## symbol.image

The Image Symbol API is defined in the symbol.image package. As the name implies, it is typically used for images and their features.

### Functions and their parameters

Following are some of the important functions and their parameters covered by the mxnet.symbol.image API −

| Function and its Parameters | Definition |
|---|---|
| adjust_lighting([data, alpha, out, name]) | As the name implies, this function adjusts the lighting level of the input. It follows the AlexNet style. |
| crop([data, x, y, width, height, out, name]) | With the help of this function, we can crop an image NDArray of shape (H x W x C) or (N x H x W x C) to the size given by the user. |
| normalize([data, mean, std, out, name]) | It will normalize a tensor of shape (C x H x W) or (N x C x H x W) with mean and standard deviation (SD). |
| random_crop([data, xrange, yrange, width, …]) | Similar to crop(), it randomly crops an image NDArray of shape (H x W x C) or (N x H x W x C) to the size given by the user. It will upsample the result if src is smaller than the size. |
| random_lighting([data, alpha_std, out, name]) | As the name implies, this function adds PCA noise randomly. It also follows the AlexNet style. |
| random_resized_crop([data, xrange, yrange, …]) | It also randomly crops an image NDArray of shape (H x W x C) or (N x H x W x C) to the given size. It will upsample the result if src is smaller than the size. It randomizes the area and the aspect ratio as well. |
| resize([data, size, keep_ratio, interp, …]) | As the name implies, this function will resize an image NDArray of shape (H x W x C) or (N x H x W x C) to the size given by the user. |
| to_tensor([data, out, name]) | It converts an image NDArray of shape (H x W x C) or (N x H x W x C) with values in the range [0, 255] to a tensor NDArray of shape (C x H x W) or (N x C x H x W) with values in the range [0, 1]. |

### Implementation Examples

In the example below, we will be using the function to_tensor to convert image NDArray of shape (H x W x C) or (N x H x W x C) with the values in the range [0, 255] to a tensor NDArray of shape (C x H x W) or (N x C x H x W) with the values in the range [0, 1].

import numpy as np

img = mx.sym.random.uniform(0, 255, (4, 2, 3)).astype(dtype=np.uint8)

mx.sym.image.to_tensor(img)


Output

The output is stated below −

<Symbol to_tensor4>


Example

img = mx.sym.random.uniform(0, 255, (2, 4, 2, 3)).astype(dtype=np.uint8)

mx.sym.image.to_tensor(img)


Output

The output is mentioned below:

<Symbol to_tensor5>
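
Since the symbols above do not show any values, the following plain-Python sketch illustrates the transformation itself on a tiny nested-list image: the channel axis moves to the front and every value is divided by 255. The helper is a hypothetical stand-in for illustration (single image only, no batch dimension), not MXNet code.

```python
# Illustrative stand-in working on nested lists, not the MXNet operator.
def to_tensor(image):
    """Convert an H x W x C image in [0, 255] to a C x H x W tensor in [0, 1]."""
    height, width, channels = len(image), len(image[0]), len(image[0][0])
    return [
        [[image[h][w][c] / 255.0 for w in range(width)] for h in range(height)]
        for c in range(channels)
    ]


# A 2 x 2 image with 3 channels; the pixel values are made up for illustration.
image = [
    [[0, 51, 255], [255, 102, 0]],
    [[51, 153, 102], [204, 0, 255]],
]
tensor = to_tensor(image)
print(len(tensor), len(tensor[0]), len(tensor[0][0]))  # 3 2 2
print(tensor[0])  # [[0.0, 1.0], [0.2, 0.8]]
```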


In the example below, we will be using the function normalize() to normalize a tensor of shape (C x H x W) or (N x C x H x W) with mean and standard deviation (SD).

img = mx.sym.random.uniform(0, 1, (3, 4, 2))

mx.sym.image.normalize(img, mean=(0, 1, 2), std=(3, 2, 1))


Output

Given below is the output of the code −

<Symbol normalize0>


Example

img = mx.sym.random.uniform(0, 1, (2, 3, 4, 2))

mx.sym.image.normalize(img, mean=(0, 1, 2), std=(3, 2, 1))


Output

The output is shown below −

<Symbol normalize1>
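
What the normalization does to the values can be shown with a plain-Python sketch on a small nested list. It assumes the usual per-channel rule, (value - mean[c]) / std[c], and handles a single C x H x W tensor only; the helper is a hypothetical illustration rather than MXNet code.

```python
# Illustrative stand-in working on nested lists, not the MXNet operator.
def normalize(tensor, mean, std):
    """Apply (value - mean[c]) / std[c] to every value of channel c."""
    return [
        [[(value - mean[c]) / std[c] for value in row] for row in channel]
        for c, channel in enumerate(tensor)
    ]


# A 3 x 1 x 2 tensor (3 channels); the values are made up for illustration.
tensor = [[[0.0, 3.0]], [[1.0, 5.0]], [[2.0, 4.0]]]
normalized = normalize(tensor, mean=(0, 1, 2), std=(3, 2, 1))
print(normalized)  # [[[0.0, 1.0]], [[0.0, 2.0]], [[0.0, 2.0]]]
```

Each channel uses its own mean and standard deviation, which is why mean and std must have one entry per channel.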


## symbol.random

The Random Symbol API is defined in the symbol.random package. As the name implies, it is the random distribution generator Symbol API of MXNet.

### Functions and their parameters

Following are some of the important functions and their parameters covered by the mxnet.symbol.random API −

| Function and its Parameters | Definition |
|---|---|
| uniform([low, high, shape, dtype, ctx, out]) | It generates random samples from a uniform distribution. |
| normal([loc, scale, shape, dtype, ctx, out]) | It generates random samples from a normal (Gaussian) distribution. |
| randn(*shape, **kwargs) | It generates random samples from a normal (Gaussian) distribution. |
| poisson([lam, shape, dtype, ctx, out]) | It generates random samples from a Poisson distribution. |
| exponential([scale, shape, dtype, ctx, out]) | It generates samples from an exponential distribution. |
| gamma([alpha, beta, shape, dtype, ctx, out]) | It generates random samples from a gamma distribution. |
| multinomial(data[, shape, get_prob, out, dtype]) | It concurrently samples from multiple multinomial distributions. |
| negative_binomial([k, p, shape, dtype, ctx, out]) | It generates random samples from a negative binomial distribution. |
| generalized_negative_binomial([mu, alpha, …]) | It generates random samples from a generalized negative binomial distribution. |
| shuffle(data, **kwargs) | It shuffles the elements randomly. |
| randint(low, high[, shape, dtype, ctx, out]) | It generates random samples from a discrete uniform distribution. |
| exponential_like([data, lam, out, name]) | It generates random samples from an exponential distribution according to the input array shape. |
| gamma_like([data, alpha, beta, out, name]) | It generates random samples from a gamma distribution according to the input array shape. |
| generalized_negative_binomial_like([data, …]) | It generates random samples from a generalized negative binomial distribution according to the input array shape. |
| negative_binomial_like([data, k, p, out, name]) | It generates random samples from a negative binomial distribution according to the input array shape. |
| normal_like([data, loc, scale, out, name]) | It generates random samples from a normal (Gaussian) distribution according to the input array shape. |
| poisson_like([data, lam, out, name]) | It generates random samples from a Poisson distribution according to the input array shape. |
| uniform_like([data, low, high, out, name]) | It generates random samples from a uniform distribution according to the input array shape. |

### Implementation Examples

In the example below, we are going to shuffle the elements randomly using the shuffle() function. It shuffles the array along the first axis, so whole rows change position while the contents of each row stay intact.

data = mx.nd.array([[0, 1, 2], [3, 4, 5], [6, 7, 8],[9,10,11]])
x = mx.sym.Variable('x')
y = mx.sym.random.shuffle(x)
y.eval(x=data)


Output

Because the shuffle is random, the row order you get may differ. You will see output similar to the following:

[
[[ 9. 10. 11.]
[ 0. 1. 2.]
[ 6. 7. 8.]
[ 3. 4. 5.]]
<NDArray 4x3 @cpu(0)>]


Example

y.eval(x=data)


Output

When you execute the above code, you should see the following output −

[
[[ 6. 7. 8.]
[ 0. 1. 2.]
[ 3. 4. 5.]
[ 9. 10. 11.]]
<NDArray 4x3 @cpu(0)>]
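
Shuffling along the first axis can be mimicked in plain Python with the standard `random` module. The helper name `shuffle_first_axis` and its optional `seed` parameter are hypothetical, introduced only for this sketch. It reorders whole rows and never mixes values between rows, which is the behaviour the two outputs above display.

```python
import random


# Hypothetical helper for illustration; it is not the MXNet operator.
def shuffle_first_axis(rows, seed=None):
    """Return the rows in random order; each row's contents stay intact."""
    rng = random.Random(seed)
    shuffled = [list(row) for row in rows]  # copy so the input is unchanged
    rng.shuffle(shuffled)
    return shuffled


data = [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]]
shuffled = shuffle_first_axis(data)
print(shuffled)
```

Calling the function again generally gives a different row order, just as evaluating the symbol twice did.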


In the example below, we are going to draw random samples from a generalized negative binomial distribution. For this, we will be using the function generalized_negative_binomial().

mx.sym.random.generalized_negative_binomial(10, 0.1)


Output

The output is given below −

<Symbol _random_generalized_negative_binomial0>


## symbol.sparse

The Sparse Symbol API is defined in the mxnet.symbol.sparse package. As the name implies, it provides sparse neural network graphs and auto-differentiation on CPU.

### Functions and their parameters

Following are some of the important functions (including Symbol creation routines, Symbol manipulation routines, mathematical functions, trigonometric functions, hyperbolic functions, reduce functions, rounding, powers, and neural network functions) and their parameters covered by the mxnet.symbol.sparse API −

| Function and its Parameters | Definition |
|---|---|
| ElementWiseSum(*args, **kwargs) | This function will add all input arguments element-wise, i.e. add_n(a1, a2, …, an) = a1 + a2 + ⋯ + an. Note that add_n is potentially more efficient than calling add n times. |
| Embedding([data, weight, input_dim, …]) | It will map integer indices to vector representations, i.e. embeddings. It maps words to real-valued vectors in a high-dimensional space, which are called word embeddings. |
| LinearRegressionOutput([data, label, …]) | It computes and optimizes for squared loss during backward propagation, giving just the output data during forward propagation. |
| LogisticRegressionOutput([data, label, …]) | It applies a logistic function, also called the sigmoid function, to the input. The function is computed as 1 / (1 + exp(−x)). |
| MAERegressionOutput([data, label, …]) | This operator computes the mean absolute error of the input. MAE is a risk metric corresponding to the expected value of the absolute error. |
| abs([data, name, attr, out]) | As the name implies, this function will return the element-wise absolute value of the input. |
| add_n(*args, **kwargs) | As the name implies, it will add all input arguments element-wise. |
| arccos([data, name, attr, out]) | This function will return the element-wise inverse cosine of the input array. |
| dot([lhs, rhs, transpose_a, transpose_b, …]) | As the name implies, it will give the dot product of two arrays. The result depends on the input array dimension. 1-D: inner product of vectors. 2-D: matrix multiplication. N-D: a sum product over the last axis of the first input and the first axis of the second input. |
| elemwise_add([lhs, rhs, name, attr, out]) | As the name implies, it will add arguments element-wise. |
| elemwise_div([lhs, rhs, name, attr, out]) | As the name implies, it will divide arguments element-wise. |
| elemwise_mul([lhs, rhs, name, attr, out]) | As the name implies, it will multiply arguments element-wise. |
| elemwise_sub([lhs, rhs, name, attr, out]) | As the name implies, it will subtract arguments element-wise. |
| exp([data, name, attr, out]) | This function will return the element-wise exponential value of the given input. |
| sgd_update([weight, grad, lr, wd, …]) | It acts as an update function for the Stochastic Gradient Descent optimizer. |
| sigmoid([data, name, attr, out]) | As the name implies, it will compute the sigmoid of x element-wise. |
| sign([data, name, attr, out]) | It will return the element-wise sign of the given input. |
| sin([data, name, attr, out]) | As the name implies, this function will compute the element-wise sine of the given input array. |

### Implementation Example

In the example below, we are going to use the Embedding() function. It will map integer indices to vector representations, i.e. word embeddings.

input_dim = 4
output_dim = 5


Example

# Every row in the weight matrix y represents a word, so y = (w0, w1, w2, w3).
y = [[ 0., 1., 2., 3., 4.],
     [ 5., 6., 7., 8., 9.],
     [ 10., 11., 12., 13., 14.],
     [ 15., 16., 17., 18., 19.]]
# The input array x represents n-grams (here 2-grams), so x = [(w1, w3), (w0, w2)].
x = [[ 1., 3.],
     [ 0., 2.]]
# The input x is mapped to its vector representation in y
# (illustrative notation showing the result, not an executable statement).
Embedding(x, y, 4, 5) = [[[ 5., 6., 7., 8., 9.],
                          [ 15., 16., 17., 18., 19.]],
                         [[ 0., 1., 2., 3., 4.],
                          [ 10., 11., 12., 13., 14.]]]
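
The lookup described above can be reproduced with a few lines of plain Python. The helper `embedding` below is a hypothetical illustration working on nested lists, not the MXNet operator: every index in the input is simply replaced by the matching row of the weight matrix.

```python
# Illustrative row lookup on nested lists, not the MXNet operator.
def embedding(indices, weight):
    """Replace every integer index with the matching row of weight."""
    return [[list(weight[int(i)]) for i in row] for row in indices]


# The same 4 x 5 weight matrix as above: row r holds 5*r, 5*r + 1, ..., 5*r + 4.
weight = [[float(5 * r + c) for c in range(5)] for r in range(4)]
x = [[1, 3], [0, 2]]
embedded = embedding(x, weight)
print(embedded[0])  # [[5.0, 6.0, 7.0, 8.0, 9.0], [15.0, 16.0, 17.0, 18.0, 19.0]]
print(embedded[1])  # [[0.0, 1.0, 2.0, 3.0, 4.0], [10.0, 11.0, 12.0, 13.0, 14.0]]
```

An input of shape (2, 2) combined with a weight matrix of shape (4, 5) yields a result of shape (2, 2, 5): one 5-dimensional vector per index.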