DM559/DM545 – Linear and Integer Programming

Broadcast

This Notebook containts further elements of numpy: Boolean Arrays, Arrays as Functions, Broadcasting, Meshgrid and 3D Plotting, Outer Product. Knowledge of these elements is not needed for the obligatory assginments and the written exam in DM559.

Boolean Arrays

Boolean arrays are simlpy arrays for which the type is a Boolean.

In [1]:
import numpy as np
M = np.array([[2,3],
              [1,4]])
N=np.array([[2,3],
            [0,0]])
M==N
Out[1]:
array([[ True,  True],
       [False, False]], dtype=bool)
In [2]:
(M==N).all() # False
(M==N).any() # True
Out[2]:
True

Boolean arrays are useful for advanced indexing:

In [3]:
M[M>2]
M[M==N]
Out[3]:
array([2, 3])

The logical operators and, or, not on arrays imply a casting from array to Boolean and therefore they cannot be used. Instead the operators &, | and ~ have been defined for logical operations on arrays.

In [4]:
a=np.array([True,False,False,True])
b=np.array([False,True,False,True])
# a and b # Error
a & b
Out[4]:
array([False, False, False,  True], dtype=bool)
In [5]:
M[(M>1) & (M<4)]
Out[5]:
array([2, 3])

Arrays as functions

A relevant way to look at arrays is as functions. For instance, a vector $[0.567, 1, 2.345, 0]$ in $\mathbb{R}^4$ may just be considered as a function from the domain set $D=\{0,1,\ldots,n-1\}$ to $\mathbb{R}$, ie, $f:D\to \mathbb{R}$ and

$$ \begin{array}{rcl} 0&\mapsto&0.567\\ 1&\mapsto&1\\ 2&\mapsto&2.345\\ 3&\mapsto&0 \end{array} $$

Similarly, an $m\times n$ matrix can be seen as a function of two parameters, $f:D_m\times D_n \to \mathbb{R}$, where $D_m$ and $D_n$ are two domains made of natural numbers from $0$ to $m-1$ and $n-1$, respectively.

This representation helps to understand also how operations are implemented in NumPy. Consider the multiplication between two arrays from the same domain and taking real values. From a function perspective, the product of the two arrays is defined as the point-wise product, ie:

$$ (f*g)(x)=f(x)*g(x) $$

From this point of view, it would be an abuse of notation summing for instance a constant with a vector, ie, $\mathbb{v}+1$ which from a functional perspective corresponds to:

$$ g(x)=f(x)+C $$

In order to maintain the point-wise structure of operations defined above, in this situation, the constants are /broadcasted/ to functions. Thus:

$$ \begin{array}{l} \bar{C}(x)=C \quad \forall x \in D\\ g(x)=f(x)+\bar{C} \end{array} $$

A more intricate example arises when building functions of several variables. Suppose for instance that from two functions of of one variable, $f$ and $g$, we construct a new function $F$ from the formula:

$$ F(x,y)=f(x)+g(y) $$

This is a valid mathematical definition. But it would not work in python. We would like to express this definition as the sum of two functions. This is possible by defining the functions $\bar{f}$ and $\bar{g}$ by

$$ \begin{array}{rl} \bar{f}(x,y)&:=f(x)\quad \forall y \\ \bar{g}(x,y)&:=g(y) \quad \forall x \end{array} $$

How does Python deal with this issue?

Broadcasting

Recall that the shape of an array is a tuple that defines the size of its dimensions. For example, an array of size $m\times n$ has the shape $s=(m,n)$.

In [6]:
I=np.identity(3)
np.shape(I)
Out[6]:
(3, 3)

Broadcasting is done automatically in NumPy. When NumPy is given two arrays of differrent shapes and is asked to perform an operation which would require the two shapes to be the same, both arrays are broadcast to a common shape.

Suppose that the two arrays have shapes $s_1$ and $s_2$. The broadcasting is performed in two steps:

  1. if the shape $s_1$ is shorter than the shape $s_2$ then ones are added on the left of the shape $s_1$. This is a reshaping.
  2. when the shapes have the same length the array is extended to match the shape $s_2$ (if possible).

In the example below, the vector 'v' is first reshaped to (1,4). Then, it is extended to (3,4) by coping the first row in the other two rows.

In [7]:
M = np.array([[11,12,13,14],
            [21,22,23,24],
            [31,32,33,34]])
v = np.array([100,200,300,400])
M + v
Out[7]:
array([[111, 212, 313, 414],
       [121, 222, 323, 424],
       [131, 232, 333, 434]])

The shapes must however match. The following would not work:

In [8]:
v=np.array([100,200,300])
# M+v # Error

Here $M$ has shape (3,4) and $v$ has shape (3,). Recall that a vector has only one dimension, and that, in the broadcasting, the reshape step 1 above, is done, by adding 1 on the left to the shape. In this case, the boradcasting would not work as (3,4) and (1,3) are not compatible for step 2. One has to do the reshaping manually, if that is the desired result.

In [9]:
M+v.reshape(-1,1)
Out[9]:
array([[111, 112, 113, 114],
       [221, 222, 223, 224],
       [331, 332, 333, 334]])

Similarly, this would not work (why?):

In [10]:
# np.arange(3) + np.arange(10) # Error

because the first has shape (3,) and the second (10,). If we decide manually to reshape (3,) to (3,1), a column vector then broadcasting (10,) to (1,10) would allow extension to (3,10):

In [11]:
np.arange(3).reshape(-1,1) + np.arange(10) # a column vector + a row vactor
Out[11]:
array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10],
       [ 2,  3,  4,  5,  6,  7,  8,  9, 10, 11]])

For the same reason the computation of the function $F(x,y)=\cos(x)+\sin(2y)$ with $x=\{0,0.5,1\}$ and $y=\{0,0.25,0.5,0.75,1\}$ would not work:

In [12]:
x=np.linspace(0,1,3)
y=np.linspace(0,1,5)
# np.cos(x)+np.sin(2*y) # Error, diffrent shapes

We have first to reshape the first vector and make its shape (3,1) which can be broadcast with (5,)

In [13]:
np.cos(x).reshape(-1,1)+np.sin(2*y)
Out[13]:
array([[ 1.        ,  1.47942554,  1.84147098,  1.99749499,  1.90929743],
       [ 0.87758256,  1.3570081 ,  1.71905355,  1.87507755,  1.78687999],
       [ 0.54030231,  1.01972784,  1.38177329,  1.53779729,  1.44959973]])

Meshgrid and 3D plotting

The function np.meshgrid helps us to construct a grid of points which is useful, for instance, in 3D plotting. This function returns coordinate matrices from two or more coordinate vectors.

Given two arrays $\vec{x}= \begin{bmatrix} x_1&x_2&\ldots&x_m \end{bmatrix} $ and $\vec{y}= \begin{bmatrix} y_1&y_2&\ldots&y_m \end{bmatrix} $, np.meshgrid creates grid, or coordinate matrix, with all combinations of elements of the vectors. The grid is here represented as an array of tuples:

$$ \begin{bmatrix} (x_1,y_1)&(x_{2},y_{1})&\cdots&(x_{m},y_{1})\\ (x_1,y_2)&(x_2,y_2)&\cdots&(x_m,y_2)\\ \vdots&\vdots&\ddots&\vdots\\ (x_{1},y_{n})&(x_{2},y_{n})&\cdots&(x_{m},y_{n})\\ \end{bmatrix} $$

The grid is not passed as an array of tuples but as 2 two-dimensional arrays:

In [14]:
np.meshgrid(np.array([1,2,3]),np.array([4,5,6,7]))
Out[14]:
[array([[1, 2, 3],
        [1, 2, 3],
        [1, 2, 3],
        [1, 2, 3]]), array([[4, 4, 4],
        [5, 5, 5],
        [6, 6, 6],
        [7, 7, 7]])]

We wish to plot a sphere. The Cartesian equation of a sphere centered at the point $(x_0,y_0,z_0)$ with radius $R$ is given by

$$(x-x_0)^2+(y-y_0)^2+(z-z_0)^2=R^2.$$

From this form we can isolate $z$:

$$ z=\pm\sqrt{R^2-(x-x_0)^2-(y-y_0)^2} $$

The mplot3d function plot_surface for plotting 3D surfaces uses three 2D arrays, X, Y, Z that represent the points to plot.

$$ \begin{bmatrix} (x_{11},y_{11},z_{11})&(x_{12},y_{12},z_{12})&\cdots&(x_{1n},y_{1n},z_{1n})\\ (x_{21},y_{21},z_{21})&(x_{22},y_{22},z_{22})&\cdots&(x_{2n},y_{2n},z_{2n})\\ \vdots&\vdots&\ddots&\vdots\\ (x_{n1},y_{n1},z_{n1})&(x_{n2},y_{n2},z_{n2})&\cdots&(x_{nn},y_{nn},z_{nn})\\ \end{bmatrix} $$

$$ X= \begin{bmatrix} x_{11}&x_{12}&\cdots&x_{11}\\ x_{21}&x_{22}&\cdots&x_{2n}\\ \vdots&\vdots&\ddots&\vdots\\ x_{n1}&x_{11}&\cdots&x_{nn}\\ \end{bmatrix} \quad Y= \begin{bmatrix} y_{11}&y_{12}&\cdots&y_{11}\\ y_{21}&y_{22}&\cdots&y_{2n}\\ \vdots&\vdots&\ddots&\vdots\\ y_{n1}&y_{11}&\cdots&y_{nn}\\ \end{bmatrix} \quad Z= \begin{bmatrix} z_{11}&z_{12}&\cdots&z_{11}\\ z_{21}&z_{22}&\cdots&z_{2n}\\ \vdots&\vdots&\ddots&\vdots\\ z_{n1}&z_{11}&\cdots&z_{nn}\\ \end{bmatrix} $$

We will use mesh grid to create the arrays X, Y, Z for plotting.

In [15]:
%matplotlib inline
In [16]:
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt

x = np.arange(-5, 5, 0.1)
y = np.arange(-5, 5, 0.1)
X, Y = np.meshgrid(x, y)
Z=np.sqrt(10**2-X**2-Y**2)

fig = plt.figure()
ax = fig.add_subplot(121)
ax.contour(X,Y,Z)
ax = fig.add_subplot(122)
ax.contourf(X,Y,Z)
plt.show()

fig = plt.figure()
ax = fig.gca(projection='3d')
h = ax.plot_surface(X, Y, Z, rstride=4, cstride=4, color='b')

Outer product

We have seen earlier the definition of matrix multiplication and how it can be adapted to vector multiplication, ie, the inner product of vectors. Another way to combine arrays by multiplication is by means of the outer product, which is indicated by the symbol $\otimes$ and defined as

$$ \vec{x}\otimes \vec{y}^T=\begin{bmatrix}x_1\\x_2\\ \ldots\\x_n\end{bmatrix} \begin{bmatrix}y_1&y_2& \ldots& y_n\end{bmatrix}= \begin{bmatrix} x_1y_1&x_1y_2&\cdots&x_1y_n\\ x_2y_1&x_2y_2&\cdots&x_2y_n\\ \vdots&\vdots&\ddots&\vdots\\ x_ny_1&x_ny_2&\cdots&x_ny_n\\ \end{bmatrix} $$

Thus the result of the outer product is a new matrix in which every row is a multiple of $\begin{bmatrix}y_1&y_2& \ldots& y_n\end{bmatrix}$ and every column a multiple of $\begin{bmatrix}x_1&x_2& \ldots&x_n\end{bmatrix}$.

In [17]:
x=np.array([1,2,3])
y=np.array([4,5,6])
np.outer(x,y)
Out[17]:
array([[ 4,  5,  6],
       [ 8, 10, 12],
       [12, 15, 18]])

Back to plotting a sphere, an alternative way to specify a sphere with center at the origin is in spherical coordinates:

$$\begin{array}{rcl} x &=& \rho\cos\theta\sin\phi \\ y &=& \rho\sin\theta\sin\phi \\ z &=& \rho\cos\phi \end{array} $$

where $\theta$ is an azimuthal parameter running from 0 to $2\pi$ (longitude), $\phi$ is a polar coordinate running from 0 to $\pi$ (colatitude), and $\rho$ is the radius.

Here we can see each variable x,y,z as the product of the other two parameters, and we can use outer to construct the 2D arrays X, Y, Z as follows:

In [18]:
fig = plt.figure()
ax = fig.gca(projection='3d')

# theta is an array from 0 to 2*pi, with 100 elements
theta = np.linspace(0, 2 * np.pi, 100)
# phi is an array from 0 to 2*pi, with 100 elements
phi = np.linspace(0, np.pi, 100)
rho = 10
# x, y, and z are the coordinates of the points for plotting
# each is arranged in a 100x100 array
X=rho*np.outer(np.cos(theta), np.sin(phi))
Y=rho*np.outer(np.sin(theta), np.sin(phi))
Z=rho*np.outer(np.ones(np.size(theta)), np.cos(phi))
h = ax.plot_surface(X, Y, Z,  rstride=4, cstride=4, color='b')

Both outer and meshgrid combine all elements of two vectors to create a matrix. While meshgrid leaves the elements in form of tuples, outer multiplies them to form a single value. Note that only the outer product is defined mathematically.