Python and NumPy: Basic Statistics - Windows 11 Installation Guides

This guide is obsolete; it was written during the early stages when I was learning Python. A more up-to-date guide is available here:

numpy: the numeric python library

Other Python guides, including installation of Python, are available in:

python: an introduction to scientific programming

Table of contents

Prerequisites
Function sort and argsort
1. Function sort and argsort – sort values in a Vector or find the argument indexes of lowest to highest values
2. Function sort and argsort Sort or find the argument indexes of lowest to highest values for either all Elements in a Matrix or across columns or rows
Function amax, argmax and maximum
Function amin, argmin and minimum
Function sum
1. Function sum Find the Sum of all Elements in a Vector
2. Function sum Find the Sum of all Elements in a Matrix or across columns or rows
Function prod
1. Function prod Find the Product of all Elements in a Vector
2. Function prod Find the Product of all Elements in a Matrix or across columns or rows
Function mean
1. Function mean Find the Mean or Average of all Elements in a Vector
2. Function mean Find the Mean or Average of all Elements in a Matrix or across columns or rows
Function median
1. Function median Find the Median of all Elements in a Vector
2. Function median Find the Median of all Elements in a Matrix or across columns or rows
Function mode
1. Function mode Find the Mode of all Elements in a Vector
2. Function mode Find the Mode of all Elements in a Matrix or across columns or rows
Function var
1. Function var Find the Variance of all Elements in a Vector
2. Function var Find the Variance of all Elements in a Matrix or across columns or rows
Function std
1. Function std – Find the Standard Deviation of all Elements in a Vector
2. Function std – Find the Standard Deviation of all Elements in a Matrix or across columns or rows
Function cumsum
1. Function cumsum – Find the Cumulative Sum of all Elements in a Vector
2. Function cumsum – Find the Cumulative Sum of all Elements in a Matrix or across columns or rows
Function cumprod
1. Function cumprod – Find the Cumulative Product of all Elements in a Vector
2. Function cumprod – Find the Cumulative Product of all Elements in a Matrix or across columns or rows
Function diff
1. Function diff – Find the Difference of all Elements in a Vector
2. Function diff – Find the Difference of all Elements in a Matrix or across columns or rows

Prerequisites

In this guide, we are going to use the numeric Python library NumPy. To use it we'll import it as np as standard:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

Function sort and argsort

Function sort and argsort – sort values in a Vector or find the argument indexes of lowest to highest values

Let's create a vector:

$\displaystyle \text{V}=\left[ {\begin{array}{*{20}{c}} 1 \\ 3 \\ 2 \end{array}} \right]$

We can use the functions sort and argsort to sort the values from lowest to highest or alternatively to list the order of the arguments from lowest to highest.

v=np.array([1,3,2])
print(v)
vsorted=np.sort(v)
print(vsorted)

[1 3 2]

[1 2 3]

sort listed the values from lowest to highest. If we want highest to lowest we can flip the result.

v=np.array([1,3,2])
print(v)
vsorted=np.flip(np.sort(v))
print(vsorted)

[1 3 2]

[3 2 1]

v=np.array([1,3,2])
print(v)
vargsorted=np.argsort(v)
print(vargsorted)

[1 3 2]

[0 2 1]

argsort tells us the indexes of the elements of v in order from the lowest value to the highest. In this case the lowest value is 1 at index 0, then the second lowest value is 2 at index 2, and finally the third lowest value is 3 at index 1.
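As a quick check of how sort and argsort relate, the sketch below indexes v with the result of argsort, which should reproduce the output of sort. The names order and vdescending are illustrative and are not part of the original guide.

```python
import numpy as np

v = np.array([1, 3, 2])
order = np.argsort(v)           # indexes that would sort v: [0 2 1]
print(v[order])                 # indexing v with them gives the sorted values
print(np.array_equal(v[order], np.sort(v)))

# Reversing the index order gives the values from highest to lowest
vdescending = v[order[::-1]]
print(vdescending)
```

Reversing the indexes with [::-1] is an alternative to flipping the sorted result with np.flip.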

Function sort and argsort Sort or find the argument indexes of lowest to highest values for either all Elements in a Matrix or across columns or rows

Let's create a new matrix:

$\displaystyle \text{M}=\left[ {\begin{array}{*{20}{c}} 3 & 6 & 2 \\ 8 & 1 & 4 \\ 5 & 9 & 7 \end{array}} \right]$

M=np.array([[3,6,2],[8,1,4],[5,9,7]])
print(M)

[[3 6 2]
 [8 1 4]
 [5 9 7]]

We can sort out all elements in this matrix by flattening it to convert it to a vector. In essence, the second row is added to the end of the first row and the third row is added to the end of the second row and so on and so forth.

Mflattened=np.ndarray.flatten(M)
print(Mflattened)

[3 6 2 8 1 4 5 9 7]

Then we can use sort and sort out the flattened matrix (which is now a vector).

Msortedall=np.sort(Mflattened)
print(Msortedall)

[1 2 3 4 5 6 7 8 9]

If we wanted the reverse order we could flip this result as we did earlier for the row vector.

Alternatively we can use argsort to get the index of the lowest to highest value:

print(Mflattened)
Margsortedall=np.argsort(Mflattened)
print(Margsortedall)

[3 6 2 8 1 4 5 9 7]

[4 2 0 5 6 1 8 3 7]

We can also sort a matrix by columns or rows. This is done using the input argument axis. An axis of 0 will work on columns and an axis of 1 will work on rows. Many NumPy functions, such as sum, work on the entire matrix when no axis argument is given. Others, such as sort and argsort, default to the last axis (axis=-1), so the matrix needs to be flattened to work on the entire matrix. Let's look at columns:

print(M)
Msortcols=np.sort(M,axis=0)
print(Msortcols)

[[3 6 2]
 [8 1 4]
 [5 9 7]]

[[3 1 2]
 [5 6 4]
 [8 9 7]]

If we wanted the result in reverse order, we could use the function flipud to flip the columns:

Msortcolsud=np.flipud(Msortcols)
print(Msortcolsud)

[[8 9 7]
 [5 6 4]
 [3 1 2]]

Now let's look at Rows:

print(M)
Msortrows=np.sort(M,axis=1)
print(Msortrows)

[[3 6 2]
 [8 1 4]
 [5 9 7]]

[[2 3 6]
 [1 4 8]
 [5 7 9]]

This time if we want the reverse order we can use the function fliplr to flip the rows:

Msortrowslr=np.fliplr(Msortrows)
print(Msortrowslr)

[[6 3 2]
 [8 4 1]
 [9 7 5]]

We can now also have a look at using argsort, for each column. Note that it has the same form as sort.

print(M)
Margsortcols=np.argsort(M,axis=0)
print(Margsortcols)

[[3 6 2]
 [8 1 4]
 [5 9 7]]

[[0 1 0]
 [2 0 1]
 [1 2 2]]

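Now we see that each column of the result contains the row indexes 0, 1 and 2, listed in the order that sorts that column: the first entry is the row index of the lowest value in the column, the second entry is the row index of the next lowest value, and the third entry is the row index of the highest value. Although each column has a 0, 1 and 2, they are not all in the same place.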

We can work on rows:

print(M)
Margsortrows=np.argsort(M,axis=1)
print(Margsortrows)

[[3 6 2]
 [8 1 4]
 [5 9 7]]

[[2 0 1]
 [1 2 0]
 [0 2 1]]

This time we see there is a 0,1 and 2 in each row.
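To see that the indexes returned by argsort really do describe the sorted order along an axis, the following sketch applies them back to M with np.take_along_axis (available in NumPy 1.15 and later). The variable names colorder, roworder, Mcols and Mrows are assumptions made for illustration.

```python
import numpy as np

M = np.array([[3, 6, 2], [8, 1, 4], [5, 9, 7]])

# Indexes that sort each column (axis=0) and each row (axis=1)
colorder = np.argsort(M, axis=0)
roworder = np.argsort(M, axis=1)

# take_along_axis applies those indexes along the same axis
Mcols = np.take_along_axis(M, colorder, axis=0)
Mrows = np.take_along_axis(M, roworder, axis=1)

# Both should match the result of sort along the same axis
print(np.array_equal(Mcols, np.sort(M, axis=0)))
print(np.array_equal(Mrows, np.sort(M, axis=1)))
```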

Function amax, argmax and maximum

Function amax Find the Maximum Value Between Two Scalars, Single Vector or Two Vectors

Let's have a look at two scalars and compare them to see which one is the maximum. To do this we can use the function amax.

$\displaystyle {\text{s1}}=1$

$\displaystyle {\text{s2}}={\mathbf{2}}$

s1=1
s2=2
smax=np.amax([s1,s2],axis=0)
print(smax)

$\displaystyle \text{smax}=\text{amax}\left[ {1,{\mathbf{2}}} \right]$

Note that [s1,s2] is a vector, which has only a single axis, the 0 axis, so we employ the same method to find the maximum value in a single vector. Let's create:

$\displaystyle {\text{v1}}=\left[ {\begin{array}{*{20}{c}} 1 & 2 & {\mathbf{3}} \end{array}} \right]$

v1=[1,2,3]
maxinv1=np.amax(v1,axis=0)
print(maxinv1)

Or combining the top two lines into a single line:

maxinv1=np.amax([1,2,3],axis=0)

We can instead compare two vectors:

$\displaystyle {\text{v1}}=\left[ {\begin{array}{*{20}{c}} 1 & {\mathbf{2}} & 3 \end{array}} \right]$

$\displaystyle {\text{v2}}=\left[ {\begin{array}{*{20}{c}} {\mathbf{2}} & 1 & {\mathbf{4}} \end{array}} \right]$

v1=[1,2,3]
v2=[2,1,4]
vmax=np.amax([v1,v2],axis=0)
print(vmax)

$\displaystyle \text{vmax}=\left[ {\begin{array}{*{20}{c}} {\text{amax}\left[ {1,{\mathbf{2}}} \right]} & {\text{amax}\left[ {{\mathbf{2}},1} \right]} & {\text{amax}\left[ {3,{\mathbf{4}}} \right]} \end{array}} \right]$

[2 2 4]

Function amax Find the Maximum in Each Row or Column of a Matrix

Let's create a matrix M where:

$\displaystyle M=\left[ {\begin{array}{*{20}{c}} 1 & 4 & 5 \\ 7 & {10} & {11} \\ {\mathbf{13}} & {\mathbf{15}} & {\mathbf{18}} \end{array}} \right]$

M=np.array([[1,4,5],
            [7,10,11],
            [13,15,18]])

We can calculate the maximum in each column using the following. Note that we set axis=0, so we operate on each column:

Mmaxineachcol=np.amax(M,axis=0)
print(Mmaxineachcol)

[13 15 18]

We can change the axis to 1 to instead find the maximum in each row:

$\displaystyle M=\left[ {\begin{array}{*{20}{c}} 1 & 4 & {\mathbf{5}} \\ 7 & {10} & {\mathbf{11}} \\ {13} & {15} & {\mathbf{18}} \end{array}} \right]$

Mmaxineachrow=np.amax(M,axis=1)
print(Mmaxineachrow)

[ 5 11 18]

Function argmax Find the Index of the Maximum Value of a Vector

Returning to the case of the single vector we saw before, we may be interested in knowing not only the maximum value, but also the location of the maximum value. To do this we can use the function argmax which has a similar form to amax:

$\displaystyle {\text{v1}}=\left[ {\begin{array}{*{20}{c}} 1 & 2 & {\mathbf{3}} \end{array}} \right]$

v1=[1,2,3]
maxinv1=np.amax(v1,axis=0)
maxlocv1=np.argmax(v1,axis=0)
print(maxinv1)
print(maxlocv1)

3

2

The maximum value is 3, at index 2 (recall that we are using zero-based indexing).

v1[2]

Function argmax Find the Index of the Maximum Values for each Column or Row in a Matrix

Let's return to matrix M where:

$\displaystyle M=\left[ {\begin{array}{*{20}{c}} 1 & 4 & 5 \\ 7 & {10} & {11} \\ {\mathbf{13}} & {\mathbf{15}} & {\mathbf{18}} \end{array}} \right]$

We can calculate the maximum in each column using the same line of code as before. Note that once again we set axis=0, so we operate on each column. The function argmax takes the same form as amax:

Mmaxineachcol=np.amax(M,axis=0)
Mmaxineachcolloc=np.argmax(M,axis=0)
print(Mmaxineachcol)
print(Mmaxineachcolloc)

[13 15 18]

[2 2 2]

In this case, the maximum values are all in position 2. Recall once again that we are using zero-based indexing.

We can change the axis to 1 to instead find the maximum and maximum location in each row:

$\displaystyle M=\left[ {\begin{array}{*{20}{c}} 1 & 4 & {\mathbf{5}} \\ 7 & {10} & {\mathbf{11}} \\ {13} & {15} & {\mathbf{18}} \end{array}} \right]$

Mmaxineachrow=np.amax(M,axis=1)
print(Mmaxineachrow)
Mmaxineachrowloc=np.argmax(M,axis=1)
print(Mmaxineachrowloc)

[ 5 11 18]

[2 2 2]
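When no axis is given, argmax returns a single index into the flattened matrix. A minimal sketch, assuming we want the row and column of the overall maximum, converts that index back with np.unravel_index. The names flatloc, row and col are illustrative.

```python
import numpy as np

M = np.array([[1, 4, 5],
              [7, 10, 11],
              [13, 15, 18]])

flatloc = np.argmax(M)                         # index into the flattened matrix
row, col = np.unravel_index(flatloc, M.shape)  # convert to a (row, column) pair
print(flatloc)
print(row, col)
print(M[row, col])                             # the maximum value itself
```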

Function maximum Find the Maximum Value of Two Matrices

Let's create two matrices M and N where:

$\displaystyle \text{M}=\left[ {\begin{array}{*{20}{c}} 1 & {\mathbf{4}} & 5 \\ 7 & {\mathbf{10}} & {11} \\ {13} & {15} & {\mathbf{18}} \end{array}} \right]$

$\displaystyle \text{N}=\left[ {\begin{array}{*{20}{c}} {\mathbf{2}} & 3 & {\mathbf{6}} \\ {\mathbf{8}} & 9 & {\mathbf{12}} \\ {\mathbf{14}} & {\mathbf{16}} & {17} \end{array}} \right]$

To create these arrays in Python, we can use:

M=np.array([[1,4,5],
            [7,10,11],
            [13,15,18]])
N=np.array([[2,3,6],
            [8,9,12],
            [14,16,17]])

To find the maximum value between the two matrices:

maxmatrix=np.maximum(M,N)
print(maxmatrix)

$\displaystyle \text{maximum(M,N)=}\left[ {\begin{array}{*{20}{c}} {\text{amax(1,}}{\mathbf{2)}} & {\text{amax(}}{\mathbf{4}}{\text{,3)}} & {\text{amax(5,}}{\mathbf{6)}} \\ {\text{amax(7,}}{\mathbf{8)}} & {\text{amax(}}{\mathbf{10}}{\text{,9)}} & {\text{amax(11,}}{\mathbf{12)}} \\ {\text{amax(13,}}{\mathbf{14)}} & {\text{amax(15,}}{\mathbf{16)}} & {\text{amax(}}{\mathbf{18}}{\text{,17)}} \end{array}} \right]=\left[ {\begin{array}{*{20}{c}} 2 & 4 & 6 \\ 8 & {10} & {12} \\ {14} & {16} & {18} \end{array}} \right]$

[[ 2  4  6]
 [ 8 10 12]
 [14 16 18]]
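The element-wise function maximum gives the same result as stacking the two matrices and taking amax along the stacking axis, which is what was done earlier with two vectors. The sketch below checks this. The names stacked, viaamax and viamaximum are illustrative.

```python
import numpy as np

M = np.array([[1, 4, 5], [7, 10, 11], [13, 15, 18]])
N = np.array([[2, 3, 6], [8, 9, 12], [14, 16, 17]])

stacked = np.stack([M, N])           # shape (2, 3, 3)
viaamax = np.amax(stacked, axis=0)   # maximum over the stacking axis
viamaximum = np.maximum(M, N)        # element-wise maximum

print(np.array_equal(viaamax, viamaximum))
```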

Function amin, argmin and minimum

Function amin Find the Minimum Value Between Two Scalars, Single Vector or Two Vectors

Let's have a look at two scalars and compare them to see which one is the minimum. To do this we can use the function amin.

$\displaystyle {\text{s1}}={\mathbf{1}}$

$\displaystyle {\text{s2}}=2$

s1=1
s2=2
smin=np.amin([s1,s2],axis=0)
print(smin)

$\displaystyle \text{smin}=\text{amin}\left[ {{\mathbf{1}},2} \right]$

The list [s1,s2] is a vector, which has only a single axis, the 0 axis. Let's look at a vector:

$\displaystyle {\text{v1}}=\left[ {\begin{array}{*{20}{c}} {\mathbf{1 }} & 2 & 3 \end{array}} \right]$

v1=[1,2,3]
mininv1=np.amin(v1,axis=0)
print(mininv1)

Or combining the top two lines into a single line:

mininv1=np.amin([1,2,3],axis=0)

We can instead compare two vectors:

$\displaystyle {\text{v1}}=\left[ {\begin{array}{*{20}{c}} {\mathbf{1}} & 2 & {\mathbf{3}} \end{array}} \right]$

$\displaystyle {\text{v2}}=\left[ {\begin{array}{*{20}{c}} 2 & {\mathbf{1}} & 4 \end{array}} \right]$

v1=[1,2,3]
v2=[2,1,4]
vmin=np.amin([v1,v2],axis=0)
print(vmin)

$\displaystyle \text{vmin}=\left[ {\begin{array}{*{20}{c}} {\text{amin}\left[ {{\mathbf{1}},2} \right]} & {\text{amin}\left[ {2},{\mathbf{1}} \right]} & {\text{amin}\left[ {{\mathbf{3}},4} \right]} \end{array}} \right]$

[1 1 3]

Function amin Find the Minimum in Each Row or Column of a Matrix

Let's create a matrix M where:

$\displaystyle M=\left[ {\begin{array}{*{20}{c}} {\mathbf{1}} & {\mathbf{4}} & {\mathbf{5}} \\ 7 & {10} & {11} \\ {13} & {15} & {18} \end{array}} \right]$

M=np.array([[1,4,5],
            [7,10,11],
            [13,15,18]])

We can calculate the minimum in each column using the following. Note that we set axis=0, so we operate on each column:

Mminineachcol=np.amin(M,axis=0)
print(Mminineachcol)

[1 4 5]

We can change the axis to 1 to instead find the minimum in each row:

$\displaystyle M=\left[ {\begin{array}{*{20}{c}} {\mathbf{1}} & 4 & 5 \\ {\mathbf{7}} & {10} & {11} \\ {\mathbf{13}} & {15} & {18} \end{array}} \right]$

Mminineachrow=np.amin(M,axis=1)
print(Mminineachrow)

[ 1  7 13]

Function argmin Find the Index of the Minimum Value of a Vector

Returning to the case of the single vector we saw before, we may be interested in knowing not only the minimum value, but also the location of the minimum value. To do this we can use the function argmin which has a similar form to amin:

$\displaystyle {\text{v1}}=\left[ {\begin{array}{*{20}{c}} {\mathbf{1}} & 2 & 3 \end{array}} \right]$

v1=[1,2,3]
mininv1=np.amin(v1,axis=0)
minlocv1=np.argmin(v1,axis=0)
print(mininv1)
print(minlocv1)

1

0

The minimum value is 1, at index 0 (recall that we are using zero-based indexing).

v1[0]

Function argmin Find the Index of the Minimum Values for each Column or Row in a Matrix

Let's return to matrix M where:

$\displaystyle M=\left[ {\begin{array}{*{20}{c}} {\mathbf{1}} & {\mathbf{4}} & {\mathbf{5}} \\ 7 & {10} & {11} \\ {13} & {15} & {18} \end{array}} \right]$

We can calculate the minimum in each column using the same line of code as before. Note that once again we set axis=0, so we operate on each column. The function argmin takes the same form as amin:

Mminineachcol=np.amin(M,axis=0)
Mminineachcolloc=np.argmin(M,axis=0)
print(Mminineachcol)
print(Mminineachcolloc)

[1 4 5]

[0 0 0]

In this case, the minimum values are all in position 0. Recall once again that we are using zero-based indexing.

We can change the axis to 1 to instead find the minimum and minimum location in each row:

$\displaystyle M=\left[ {\begin{array}{*{20}{c}} {\mathbf{1}} & 4 & 5 \\ {\mathbf{7}} & {10} & {11} \\ {\mathbf{13}} & {15} & {18} \end{array}} \right]$

Mminineachrow=np.amin(M,axis=1)
print(Mminineachrow)
Mminineachrowloc=np.argmin(M,axis=1)
print(Mminineachrowloc)

[ 1  7 13]

[0 0 0]
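The locations returned by argmin can be used to pick the minimum values back out of the matrix. The sketch below pairs each row index with the column index returned by argmin and compares the result with amin. The names rowloc and rows are illustrative.

```python
import numpy as np

M = np.array([[1, 4, 5], [7, 10, 11], [13, 15, 18]])

rowloc = np.argmin(M, axis=1)   # column index of the minimum in each row
rows = np.arange(M.shape[0])    # row indexes 0, 1, 2

# Pair each row index with its column index to select the minimum values
print(M[rows, rowloc])
print(np.array_equal(M[rows, rowloc], np.amin(M, axis=1)))
```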

Function minimum Find the Minimum Value of Two Matrices

Let's create two matrices M and N where:

$\displaystyle \text{M}=\left[ {\begin{array}{*{20}{c}} {\mathbf{1}} & 4 & {\mathbf{5}} \\ {\mathbf{7}} & {10} & {\mathbf{11}} \\ {\mathbf{13}} & {\mathbf{15}} & {18} \end{array}} \right]$

$\displaystyle \text{N}=\left[ {\begin{array}{*{20}{c}} 2 & {\mathbf{3}} & 6 \\ 8 & {\mathbf{9}} & {12} \\ {14} & {16} & {\mathbf{17}} \end{array}} \right]$

To create these arrays in Python, we can use:

M=np.array([[1,4,5],
            [7,10,11],
            [13,15,18]])
N=np.array([[2,3,6],
            [8,9,12],
            [14,16,17]])

To find the minimum value between the two matrices:

minmatrix=np.minimum(M,N)
print(minmatrix)

$\displaystyle \text{minimum(M,N)=}\left[ {\begin{array}{*{20}{c}} {\text{amin(}}{\mathbf{1}}{\text{,2)}} & {\text{amin(4,}}{\mathbf{3}}{\text{)}} & {\text{amin(}}{\mathbf{5}}{\text{,6)}} \\ {\text{amin(}}{\mathbf{7}}{\text{,8)}} & {\text{amin(10,}}{\mathbf{9}}{\text{)}} & {\text{amin(}}{\mathbf{11}}{\text{,12)}} \\ {\text{amin(}}{\mathbf{13}}{\text{,14)}} & {\text{amin(}}{\mathbf{15}}{\text{,16)}} & {\text{amin(18}}{\mathbf{,17}}{\text{)}} \end{array}} \right]=\left[ {\begin{array}{*{20}{c}} 1 & 3 & 5 \\ 7 & 9 & {11} \\ {13} & {15} & {17} \end{array}} \right]$

[[ 1  3  5]
 [ 7  9 11]
 [13 15 17]]

Function sum

Function sum Find the Sum of all Elements in a Vector

$\displaystyle \text{v}=\left[ {\begin{array}{*{20}{c}} 1 & 3 & 2 \end{array}} \right]$

$\displaystyle \text{vsum}=1+3+2$

v=np.array([1,3,2])
print(v)
vsum=np.sum(v)
print(vsum)

[1 3 2]

6

Function sum Find the Sum of all Elements in a Matrix or across columns or rows

Let's create a new matrix:

$\displaystyle \text{M}=\left[ {\begin{array}{*{20}{c}} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{array}} \right]$

We can create this using:

M=np.array([[1,2,3],[4,5,6],[7,8,9]])
print(M)

[[1 2 3]
 [4 5 6]
 [7 8 9]]

Or we can instead use the functions arange and reshape. Recall arange creates a vector of integer numbers starting from 0 (zero order indexing) and we go up to but don't include the value we enter.

M=np.arange(9)
print(M)

[0 1 2 3 4 5 6 7 8]

We want to start at 1 instead of zero so we can add 1:

M=np.arange(9)+1
print(M)

[1 2 3 4 5 6 7 8 9]

Alternatively we could use two input arguments to specify a Start and Stop:

M=np.arange(start=1,stop=10)
print(M)

[1 2 3 4 5 6 7 8 9]

We have a vector of 9 elements as opposed to a 3 by 3 matrix. We can use the function reshape to make this matrix:

Combining arange and reshape:

M=np.reshape(np.arange(9)+1,[3,3])
print(M)

[[1 2 3]
 [4 5 6]
 [7 8 9]]

Let's look at the sum of all elements:

Msumall=np.sum(M)
print(Msumall)

$\displaystyle \text{sum}\left( \text{M} \right)=\text{sum}\left( {\left[ {\begin{array}{*{20}{c}} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \end{array}} \right]} \right)$

45

Let's have a look at the sum of columns and the sum of rows, recall axis=0 works on columns and axis=1 works on rows:

Msumcols=np.sum(M,axis=0)
print(Msumcols)

$\displaystyle \text{sum(M,0)}=\left[ {\begin{array}{*{20}{c}} {\text{sum}\left( {\left[ {\begin{array}{*{20}{c}} 1 \\ 4 \\ 7 \end{array}} \right]} \right)} & {\text{sum}\left( {\left[ {\begin{array}{*{20}{c}} 2 \\ 5 \\ 8 \end{array}} \right]} \right)} & {\text{sum}\left( {\left[ {\begin{array}{*{20}{c}} 3 \\ 6 \\ 9 \end{array}} \right]} \right)} \end{array}} \right]$

$\displaystyle \text{sum(M,0)}=\left[ {\begin{array}{*{20}{c}} {1+4+7} & {2+5+8} & {3+6+9} \end{array}} \right]$

$\displaystyle \text{sum(M,0)}=\left[ {\begin{array}{*{20}{c}} {12} & {15} & {18} \end{array}} \right]$

[12 15 18]

Msumrows=np.sum(M,axis=1)
print(Msumrows)

$\displaystyle \text{sum(M,1)}=\left[ {\begin{array}{*{20}{c}} {\text{sum}\left( {\left[ {\begin{array}{*{20}{c}} 1 & 2 & 3 \end{array}} \right]} \right)} \\ {\text{sum}\left( {\left[ {\begin{array}{*{20}{c}} 4 & 5 & 6 \end{array}} \right]} \right)} \\ {\text{sum}\left( {\left[ {\begin{array}{*{20}{c}} 7 & 8 & 9 \end{array}} \right]} \right)} \end{array}} \right]$

$\displaystyle \text{sum(M,1)}=\left[ {\begin{array}{*{20}{c}} {1+2+3} \\ {4+5+6} \\ {7+8+9} \end{array}} \right]$

$\displaystyle \text{sum(M,1)}=\left[ {\begin{array}{*{20}{c}} 6 \\ {15} \\ {24} \end{array}} \right]$

[ 6 15 24]
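As a consistency check, the column sums and the row sums should each add up to the sum of all elements. The sketch below also shows the keepdims argument of sum, which keeps the row sums as a column, matching the way they are written in the equations above. The name Msumrowscol is illustrative.

```python
import numpy as np

M = np.reshape(np.arange(9) + 1, [3, 3])

Msumall = np.sum(M)
Msumcols = np.sum(M, axis=0)
Msumrows = np.sum(M, axis=1)

# Both partial results should add up to the total
print(Msumall, np.sum(Msumcols), np.sum(Msumrows))

# keepdims=True keeps the result two-dimensional, so the row sums stay a column
Msumrowscol = np.sum(M, axis=1, keepdims=True)
print(Msumrowscol.shape)
```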

Function prod

Function prod Find the Product of all Elements in a Vector

$\displaystyle \text{v}=\left[ {\begin{array}{*{20}{c}} 1 & 3 & 2 \end{array}} \right]$

$\displaystyle \text{vprod}=1\times 3\times 2$

v=np.array([1,3,2])
print(v)
vprod=np.prod(v)
print(vprod)

[1 3 2]

6

Function prod Find the Product of all Elements in a Matrix or across columns or rows

print(M)

[[1 2 3]
 [4 5 6]
 [7 8 9]]

The function prod has a similar form to the sum function except it calculates the product of all elements as opposed to the sum of all elements (1×2×3 as opposed to 1+2+3).

Let's look at the product of all elements:

Mprodall=np.prod(M)
print(Mprodall)

$\displaystyle \text{prod}\left( \text{M} \right)=\text{prod}\left( {\left[ {\begin{array}{*{20}{c}} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \end{array}} \right]} \right)$

362880

Let's have a look at the product of columns and the product of rows, recall axis=0 works on columns and axis=1 works on rows:

Mprodcols=np.prod(M,axis=0)
print(Mprodcols)

$\displaystyle \text{prod(M,0)}=\left[ {\begin{array}{*{20}{c}} {\text{prod}\left( {\left[ {\begin{array}{*{20}{c}} 1 \\ 4 \\ 7 \end{array}} \right]} \right)} & {\text{prod}\left( {\left[ {\begin{array}{*{20}{c}} 2 \\ 5 \\ 8 \end{array}} \right]} \right)} & {\text{prod}\left( {\left[ {\begin{array}{*{20}{c}} 3 \\ 6 \\ 9 \end{array}} \right]} \right)} \end{array}} \right]$

$\displaystyle \text{prod(M,0)}=\left[ {\begin{array}{*{20}{c}} {1\times 4\times 7} & {2\times 5\times 8} & {3\times 6\times 9} \end{array}} \right]$

$\displaystyle \text{prod(M,0)}=\left[ {\begin{array}{*{20}{c}} {28} & {80} & {162} \end{array}} \right]$

[ 28  80 162]

Mprodrows=np.prod(M,axis=1)
print(Mprodrows)

$\displaystyle \text{prod(M,1)}=\left[ {\begin{array}{*{20}{c}} {\text{prod}\left( {\left[ {\begin{array}{*{20}{c}} 1 & 2 & 3 \end{array}} \right]} \right)} \\ {\text{prod}\left( {\left[ {\begin{array}{*{20}{c}} 4 & 5 & 6 \end{array}} \right]} \right)} \\ {\text{prod}\left( {\left[ {\begin{array}{*{20}{c}} 7 & 8 & 9 \end{array}} \right]} \right)} \end{array}} \right]$

$\displaystyle \text{prod(M,1)}=\left[ {\begin{array}{*{20}{c}} {1\times 2\times 3} \\ {4\times 5\times 6} \\ {7\times 8\times 9} \end{array}} \right]$

$\displaystyle \text{prod(M,1)}=\left[ {\begin{array}{*{20}{c}} 6 \\ {120} \\ {504} \end{array}} \right]$

[  6 120 504]
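To make the column products concrete, the following sketch multiplies the elements of each column together with explicit loops and compares the result with np.prod. The names colprods and total are illustrative, and the loop is only meant to show what prod computes, not how NumPy implements it.

```python
import numpy as np

M = np.reshape(np.arange(9) + 1, [3, 3])

# Multiply the elements of each column together by hand
colprods = []
for col in range(M.shape[1]):
    total = 1
    for value in M[:, col]:
        total = total * int(value)
    colprods.append(total)

print(colprods)
print(np.prod(M, axis=0).tolist())
```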

Function mean

Function mean Find the Mean or Average of all Elements in a Vector

The mean or average of all values in a vector is defined as:

$\displaystyle {{x}_{{mean}}}={{\mu }_{x}}=\frac{{\sum\limits_{{i=0}}^{{m-1}}{{{x}_{i}}}}}{m}$

Recall that we are using zero-based indexing, so the sum starts at element 0 and goes up in integer steps to element m-1, where m is the number of elements. For the vector with three elements, this becomes:

$\displaystyle \text{v}=\left[ {\begin{array}{*{20}{c}} \text{A} \\ \text{B} \\ \text{C} \end{array}} \right]$

The mean is:

$\displaystyle \text{vmean}=\frac{{\text{A}+\text{B}+\text{C}}}{3}$

Let's create the simple vector:

$\displaystyle \text{v}=\left[ {\begin{array}{*{20}{c}} 1 & 3 & 2 \end{array}} \right]$

And find its mean:

v=np.array([1,3,2])
print(v)
vmean=np.mean(v)
print(vmean)

$\displaystyle \text{vmean}=\frac{{\text{1}+\text{3}+\text{2}}}{3}$

[1 3 2]

2.0
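The definition above can be checked directly by dividing the sum of the elements by the number of elements. This is a minimal sketch, with m and manualmean being illustrative names.

```python
import numpy as np

v = np.array([1, 3, 2])
m = v.size                    # number of elements
manualmean = np.sum(v) / m    # (1 + 3 + 2) / 3
print(manualmean)
print(manualmean == np.mean(v))
```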

Function mean Find the Mean or Average of all Elements in a Matrix or across columns or rows

Let's continue with the matrix M:

print(M)

[[1 2 3]
 [4 5 6]
 [7 8 9]]

Let's look at the mean of all elements. The mean is the sum of all elements divided by the number of elements.

Mmeanall=np.mean(M)
print(Mmeanall)

$\displaystyle \text{mean}\left( \text{M} \right)=\text{mean}\left( {\left[ {\begin{array}{*{20}{c}} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \end{array}} \right]} \right)$

5.0

Recall that the sum before was 45 and there are 9 elements. 45 divided by 9 gives the value 5. Let's have a look at the mean of columns and the mean of rows, recall axis=0 works on columns and axis=1 works on rows:

Mmeancols=np.mean(M,axis=0)
print(Mmeancols)

$\displaystyle \text{mean(M,0)}=\left[ {\begin{array}{*{20}{c}} {\text{mean}\left( {\left[ {\begin{array}{*{20}{c}} 1 \\ 4 \\ 7 \end{array}} \right]} \right)} & {\text{mean}\left( {\left[ {\begin{array}{*{20}{c}} 2 \\ 5 \\ 8 \end{array}} \right]} \right)} & {\text{mean}\left( {\left[ {\begin{array}{*{20}{c}} 3 \\ 6 \\ 9 \end{array}} \right]} \right)} \end{array}} \right]$

$\displaystyle \text{mean(M,0)}=\left[ {\begin{array}{*{20}{c}} {\frac{{1+4+7}}{3}} & {\frac{{2+5+8}}{3}} & {\frac{{3+6+9}}{3}} \end{array}} \right]$

$\displaystyle \text{mean(M,0)}=\left[ {\begin{array}{*{20}{c}} {4} & {5} & {6} \end{array}} \right]$

[4. 5. 6.]

Once again this is the same as the sum divided by the number of elements, in this case each column has 3 elements.

Mmeanrows=np.mean(M,axis=1)
print(Mmeanrows)

$\displaystyle \text{mean(M,1)}=\left[ {\begin{array}{*{20}{c}} {\text{mean}\left( {\left[ {\begin{array}{*{20}{c}} 1 & 2 & 3 \end{array}} \right]} \right)} \\ {\text{mean}\left( {\left[ {\begin{array}{*{20}{c}} 4 & 5 & 6 \end{array}} \right]} \right)} \\ {\text{mean}\left( {\left[ {\begin{array}{*{20}{c}} 7 & 8 & 9 \end{array}} \right]} \right)} \end{array}} \right]$

$\displaystyle \text{mean(M,1)}=\left[ {\begin{array}{*{20}{c}} {\frac{{1+2+3}}{3}} \\ {\frac{{4+5+6}}{3}} \\ {\frac{{7+8+9}}{3}} \end{array}} \right]$

$\displaystyle \text{mean(M,1)}=\left[ {\begin{array}{*{20}{c}} 2 \\ {5} \\ {8} \end{array}} \right]$

[2. 5. 8.]

Once again this is the same as the sum divided by the number of elements, in this case each row has 3 elements.

There are other metrics we can look at such as the median and the mode. First let's look at the limitations of the mean using a new matrix M:

$\displaystyle \text{M}=\left[ {\begin{array}{*{20}{c}} 7 & 1 & 4 & 3 \\ 2 & 9 & {\mathbf{100}} & 8 \\ {10} & 5 & 6 & {11} \end{array}} \right]$

M=np.array([[7,1,4,3],[2,9,100,8],[10,5,6,11]])
print(M)

[[  7   1   4   3]
 [  2   9 100   8]
 [ 10   5   6  11]]

This data contains a clear outlier. Let's look at the mean of the matrix and then each column and row using:

Mmeanall=np.mean(M)
print(Mmeanall)
Mmeancols=np.mean(M,axis=0)
print(Mmeancols)
Mmeanrows=np.mean(M,axis=1)
print(Mmeanrows)

13.833333333333334

[ 6.33333333  5.         36.66666667  7.33333333]

[ 3.75 29.75  8.  ]

The mean of all elements is substantially larger than all the usual elements due to the presence of the outlier. The column mean at index 2 (36.67) and the row mean at index 1 (29.75) are also heavily influenced by this outlier.

Function median

Function median Find the Median of all Elements in a Vector

To calculate the median of a vector, we sort all the elements in the vector from the lowest to the highest value and then take the middle number. For instance:

$\displaystyle \text{v}=\left[ {\begin{array}{*{20}{c}} \text{C} \\ \text{A} \\ \text{B} \end{array}} \right]$

$\displaystyle \text{vsorted}=\left[ {\begin{array}{*{20}{c}} \text{A} \\ \mathbf{B} \\ \text{C} \end{array}} \right]$

$\displaystyle \text{vmedian}=\text{B}$

If there are an even number of elements for instance:

$\displaystyle \text{v}=\left[ {\begin{array}{*{20}{c}} \text{A} \\ \mathbf{B} \\ \mathbf{C} \\ \text{D} \end{array}} \right]$

Then the median is the mean of the two middle values:

$\displaystyle \text{vmedian}=\frac{{\text{B}+\text{C}}}{2}$

Let's return to the vector:

$\displaystyle \text{v}=\left[ {\begin{array}{*{20}{c}} 1 \\ 3 \\ 2 \end{array}} \right]$

Let's sort it out lowest to highest and then calculate its median:

v=np.array([1,3,2])
print(v)
vsorted=np.sort(v)
print(vsorted)
vmedian=np.median(v)
print(vmedian)

$\displaystyle \text{v}=\left[ {\begin{array}{*{20}{c}} \text{1} \\ \text{3} \\ \text{2} \end{array}} \right]$

$\displaystyle \text{vsort}=\left[ {\begin{array}{*{20}{c}} 1 \\ \mathbf{2} \\ 3 \end{array}} \right]$

$\displaystyle \text{vmedian}=2$

[1 3 2]

[1 2 3]

2.0
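The two cases described above (odd and even number of elements) can be sketched as a small function. The helper manualmedian is hypothetical and written only to illustrate the definition; in practice np.median would be used.

```python
import numpy as np

def manualmedian(values):
    """Median of a 1-D array by sorting and picking the middle (illustrative helper)."""
    s = np.sort(values)
    n = s.size
    mid = n // 2
    if n % 2 == 1:
        # Odd number of elements: take the single middle value
        return float(s[mid])
    # Even number of elements: take the mean of the two middle values
    return (s[mid - 1] + s[mid]) / 2

print(manualmedian(np.array([1, 3, 2])))
print(manualmedian(np.array([4, 1, 3, 2])))
```

Both results should agree with np.median applied to the same vectors.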

Function median Find the Median of all Elements in a Matrix or across columns or rows

Let's look at this M again. Instead of looking at the mean, we will take the median, that is the middle number:

print(M)

[[  7   1   4   3]
 [  2   9 100   8]
 [ 10   5   6  11]]

Let's look at the median of all elements. To find the median we list all elements in order and take the middle value, or the mean of the two middle values if there are an even number of values. To have a look at how this works, we can flatten the matrix and then sort the flattened matrix to look at all the elements in order:

Mflattened=np.ndarray.flatten(M)
print(Mflattened)
Msortedall=np.sort(Mflattened)
print(Msortedall)

[  7   1   4   3   2   9 100   8  10   5   6  11]

[  1   2   3   4   5   6   7   8   9  10  11 100]

$\displaystyle \text{Mflattened}\left( \text{M} \right)=\left( {\left[ {\begin{array}{*{20}{c}} 7 & 1 & 4 & 3 & 2 & 9 & {100} & 8 & {10} & 5 & 6 & {11} \end{array}} \right]} \right)$

$\displaystyle \text{Msorted}\left( \text{Mflattened} \right)=\left( {\left[ {\begin{array}{*{20}{c}} 1 & 2 & 3 & 4 & 5 & {\mathbf{6}} & {\mathbf{7}} & 8 & 9 & {10} & {11} & {100} \end{array}} \right]} \right)$

Here the middle values are 6 and 7, the mean of 6 and 7 is 6.5.

Mmedianall=np.median(M)
print(Mmedianall)

6.5

As we can see this value is far more representative of this dataset than the value 13.8333 which was much more heavily influenced by the outlier.
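To illustrate how much the single outlier moves each statistic, the sketch below computes the mean and median of all elements with and without the value 100. Removing the outlier is done here only for comparison, and the names values and withoutoutlier are illustrative.

```python
import numpy as np

M = np.array([[7, 1, 4, 3], [2, 9, 100, 8], [10, 5, 6, 11]])
values = M.flatten()
withoutoutlier = values[values != 100]   # drop the outlier for comparison

# Mean and median with the outlier included
print(np.mean(values), np.median(values))
# Mean and median with the outlier removed
print(np.mean(withoutoutlier), np.median(withoutoutlier))
```

The mean changes substantially when the outlier is removed while the median barely moves.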

Let's have a look at the median of columns and the median of rows, recall axis=0 works on columns and axis=1 works on rows:

Msortedcols=np.sort(M,0)
print(Msortedcols)
Mmediancols=np.median(M,axis=0)
print(Mmediancols)

$\displaystyle {\text{M}}=\left[ {\begin{array}{*{20}{c}} 7 & 1 & 4 & 3 \\ 2 & 9 & {100} & 8 \\ {10} & 5 & 6 & {11} \end{array}} \right]$

$\displaystyle {\text{Msortedcols}}=\left[ {\begin{array}{*{20}{c}} 2 & 1 & 4 & 3 \\ {\mathbf{7}} & {\mathbf{5}} & {\mathbf{6}} & {\mathbf{8}} \\ {10} & 9 & {100} & {11} \end{array}} \right]$

$\displaystyle \text{median}\left( {\text{M,0}} \right)=\left[ {\begin{array}{*{20}{c}} 7 & 5 & 6 & 8 \end{array}} \right]$

[[  2   1   4   3]
 [  7   5   6   8]
 [ 10   9 100  11]]

[7. 5. 6. 8.]

Once again the element at index 2, which is 6, is far more representative of the data set than the value of 36.6667, which was highly influenced by the outlier.

Msortedrows=np.sort(M,1)
print(Msortedrows)
Mmedianrows=np.median(M,axis=1)
print(Mmedianrows)

$\displaystyle {\text{M}}=\left[ {\begin{array}{*{20}{c}} 7 & 1 & 4 & 3 \\ 2 & 9 & {100} & 8 \\ {10} & 5 & 6 & {11} \end{array}} \right]$

$\displaystyle {\text{Msortedrows}}=\left[ {\begin{array}{*{20}{c}} 1 & {\mathbf{3}} & {\mathbf{4}} & 7 \\ 2 & {\mathbf{8}} & {\mathbf{9}} & {100} \\ 5 & {\mathbf{6}} & {\mathbf{10}} & {11} \end{array}} \right]$

$\displaystyle \text{Mmedianrows}\left( {\text{M,1}} \right)=\left[ {\begin{array}{*{20}{c}} {3.5} \\ {8.5} \\ 8 \end{array}} \right]$

[[  1   3   4   7]
 [  2   8   9 100]
 [  5   6  10  11]]

[3.5 8.5 8.]

Function mode

Function mode Find the Mode of all Elements in a Vector

The mode is the most frequently occurring value in a data set of discrete values.

$\displaystyle \text{v}=\left[ {\begin{array}{*{20}{c}} \text{A} \\ \text{A} \\ \text{B} \\ \text{A} \\ \text{A} \\ \text{C} \\ \text{A} \\ \text{A} \\ \text{B} \end{array}} \right]$

$\displaystyle \text{vsort}=\left[ {\begin{array}{*{20}{c}} \text{A} \\ \text{A} \\ \text{A} \\ \text{A} \\ \text{A} \\ \text{A} \\ \text{B} \\ \text{B} \\ \text{C} \end{array}} \right]$

$\displaystyle \text{vsort}=\left[ {\begin{array}{*{20}{c}} \mathbf{A} \\ \mathbf{A} \\ \mathbf{A} \\ \mathbf{A} \\ \mathbf{A} \\ \mathbf{A} \\ \text{B} \\ \text{B} \\ \text{C} \end{array}} \right]$

$\displaystyle \text{vmode}=\text{A}$

Let's create the vector:

$\displaystyle \text{v}=\left[ {\begin{array}{*{20}{c}} 1 \\ 1 \\ 2 \\ 1 \\ 1 \\ 3 \\ 1 \\ 1 \\ 1 \end{array}} \right]$

Let's look at the mode of the vector. Note that the mode, although commonly used, is not included in NumPy, so we have to import SciPy and use the mode function from its stats module:

import scipy as sp
import scipy.stats  # loads the stats submodule so that sp.stats is available
v=np.array([1,1,2,1,1,3,1,1,1])
print(v)
vsort=np.sort(v)
print(vsort)
vmode=sp.stats.mode(v)
print(vmode)

[1 1 2 1 1 3 1 1 1]

[1 1 1 1 1 1 1 2 3]

ModeResult(mode=array([1]), count=array([7]))
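If SciPy is not available, the mode of a vector can also be found with NumPy alone by counting how often each value occurs. This is a sketch of an alternative approach using np.unique and is not the method used in the rest of this guide. The names values and counts are illustrative.

```python
import numpy as np

v = np.array([1, 1, 2, 1, 1, 3, 1, 1, 1])

# unique returns the distinct values and how often each one occurs
values, counts = np.unique(v, return_counts=True)
print(values)
print(counts)

# The mode is the value with the highest count
# (argmax returns the first one if there is a tie)
vmode = values[np.argmax(counts)]
print(vmode, counts.max())
```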

Function mode Find the Mode of all Elements in a Matrix or across columns or rows

Let's look at another matrix M of discrete values:

$\displaystyle \text{M}=\left[ {\begin{array}{*{20}{c}} 1 & 2 & 9 & 9 \\ 1 & 9 & {14} & 9 \\ {11} & 9 & {14} & {14} \end{array}} \right]$

M=np.array([[1,2,9,9],[1,9,14,9],[11,9,14,14]])
print(M)

[[ 1  2  9  9]
 [ 1  9 14  9]
 [11  9 14 14]]

Let's look at the mode. To find it, we can list all the elements in order and take the most frequent value. The mode is useful for discrete data: clothing comes in discrete sizes, so a fashion designer who wants to test how well a certain design sells is likely to make it available first in the mode size.

Let's look at the mode of all elements:

Mflattened=np.ndarray.flatten(M)
print(Mflattened)
Msortedall=np.sort(Mflattened)
print(Msortedall)
Mmodeall=sp.stats.mode(Mflattened)
print(Mmodeall)

$\displaystyle \text{Mflattened}=\left[ {\begin{array}{*{20}{c}} 1 & 2 & 9 & 9 & 1 & 9 & {14} & 9 & {11} & 9 & {14} & {14} \end{array}} \right]$

$\displaystyle \text{Msortedall}=\left[ {\begin{array}{*{20}{c}} 1 & 1 & 2 & {\mathbf{9}} & {\mathbf{9}} & {\mathbf{9}} & {\mathbf{9}} & {\mathbf{9}} & {11} & {14} & {14} & {14} \end{array}} \right]$

[ 1  2  9  9  1  9 14  9 11  9 14 14]

[ 1  1  2  9  9  9  9  9 11 14 14 14]

ModeResult(mode=array([9]), count=array([5]))

Here in the entire dataset, the mode is 9 and it occurs 5 times.

Let's have a look at the mode of the columns and the mode of the rows. Recall that axis=0 works on columns and axis=1 works on rows. In real data, each column or row could hold samples taken in different countries, for instance:

Msortedcols=np.sort(M,axis=0)
print(Msortedcols)
Mmodecols=sp.stats.mode(M,axis=0)
print(Mmodecols)

$\displaystyle \text{Msortedcols(M,0)=}\left[ {\begin{array}{*{20}{c}} {\text{Msortedcols}\left( {\left[ {\begin{array}{*{20}{c}} {\mathbf{1}} \\ {\mathbf{1}} \\ {11} \end{array}} \right]} \right)} & {\text{Msortedcols}\left( {\left[ {\begin{array}{*{20}{c}} 2 \\ {\mathbf{9}} \\ {\mathbf{9}} \end{array}} \right]} \right)} & {\text{Msortedcols}\left( {\left[ {\begin{array}{*{20}{c}} 9 \\ {\mathbf{14}} \\ {\mathbf{14}} \end{array}} \right]} \right)} & {\text{Msortedcols}\left( {\left[ {\begin{array}{*{20}{c}} {\mathbf{9}} \\ {\mathbf{9}} \\ {14} \end{array}} \right]} \right)} \end{array}} \right]$

$\displaystyle \text{mode(M,0)=}\left[ {\begin{array}{*{20}{c}} 1 & 9 & {14} & 9 \end{array}} \right]$

[[ 1  2  9  9]
 [ 1  9 14  9]
 [11  9 14 14]]

ModeResult(mode=array([[ 1,  9, 14,  9]]), count=array([[2, 2, 2, 2]]))

Msortedrows=np.sort(M,axis=1)
print(Msortedrows)
Mmoderows=sp.stats.mode(M,axis=1)
print(Mmoderows)

$\displaystyle \text{Msortedrows}=\left[ {\begin{array}{*{20}{c}} 1 & 2 & {\mathbf{9}} & {\mathbf{9}} \\ 1 & {\mathbf{9}} & {\mathbf{9}} & {14} \\ 9 & {11} & {\mathbf{14}} & {\mathbf{14}} \end{array}} \right]$

$\displaystyle \text{mode(M,1)=}\left[ {\begin{array}{*{20}{c}} 9 \\ 9 \\ {14} \end{array}} \right]$

[[ 1  2  9  9]
 [ 1  9  9 14]
 [ 9 11 14 14]]

ModeResult(mode=array([[ 9],
       [ 9],
       [14]]), count=array([[2],
       [2],
       [2]]))
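The column and row modes can also be sketched with NumPy only, by applying a small one-dimensional mode function along an axis. The helper mode_1d below is hypothetical (it is not part of NumPy or SciPy) and is only meant to illustrate what the axis argument does.

```python
import numpy as np

def mode_1d(values):
    """Return the most frequent value of a 1-D array (hypothetical helper)."""
    uniques, counts = np.unique(values, return_counts=True)
    # argmax picks the first maximum, so ties go to the smallest value.
    return uniques[np.argmax(counts)]

M = np.array([[1, 2, 9, 9],
              [1, 9, 14, 9],
              [11, 9, 14, 14]])

# axis=0 hands each column to mode_1d, axis=1 hands it each row.
Mmodecols = np.apply_along_axis(mode_1d, 0, M)
Mmoderows = np.apply_along_axis(mode_1d, 1, M)
print(Mmodecols)   # one mode per column: 1, 9, 14, 9
print(Mmoderows)   # one mode per row: 9, 9, 14
```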

Function var

Function var Find the Variance of all Elements in a Vector

By default, the variance in NumPy is defined as:

$\displaystyle \text{var}=\frac{\sum\limits_{i=1}^{m}{\left( x_{i}-\mu_{x} \right)}^{2}}{m}=\frac{\sum\limits_{i=1}^{m}{\left( x_{i}-x_{\text{mean}} \right)}^{2}}{m}$

Let's have a look at the formula above. Firstly, let's look at the numerator only. The term in the brackets is the difference between each value and the mean. Take the case of the vector:

$\displaystyle \text{v}=\left[ {\begin{array}{*{20}{c}} 1 \\ 2 \\ 3 \end{array}} \right]$

In this case we can first calculate the mean:

$\displaystyle \text{vmean}=\frac{{1+2+3}}{3}=2$

The mean value is 2. The difference between 1 and 2 is -1, the difference between 2 and 2 is 0 and the difference between 3 and 2 is 1. If we simply sum these differences we get 0. This is expected, because the mean is by definition the value about which the positive and negative differences balance. To measure how far the values lie from the mean, we want a quantity that is positive regardless of whether a value is above or below the mean. For this reason we instead sum the squares of the differences, which are never negative. This gives us:

$\displaystyle {{\left( {-1} \right)}^{2}}+{{\left( 0 \right)}^{2}}+{{\left( 1 \right)}^{2}}=2$

In other words as some points are above the mean and some points are below the mean, all difference terms are squared to give a positive number indicating a difference. We can calculate the variance of all elements of the matrix or using axis=0,1 to look at the variance of each column or row respectively.
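The reasoning above can be checked numerically. The following is a minimal sketch using the same vector; the name deviations is an assumption made for this example.

```python
import numpy as np

v = np.array([1, 2, 3])

# Difference between each value and the mean: -1, 0 and 1.
deviations = v - np.mean(v)

# The raw differences cancel out, so their sum says nothing about spread.
print(np.sum(deviations))        # 0.0

# Squaring removes the sign, so the sum now reflects the spread.
print(np.sum(deviations ** 2))   # 2.0
```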

Clearly, as the number of points increases, this sum will by definition also increase, so we need a normalisation factor to compensate. By default this factor is simply the number of points, which gives the formula at the start of this section.

In Python to get this we would type:

v=np.array([1,2,3])
print(v)
vvar=np.var(v)
print(vvar)

$\displaystyle \frac{{{{{\left( {-1} \right)}}^{2}}+{{{\left( 0 \right)}}^{2}}+{{{\left( 1 \right)}}^{2}}}}{3}=\frac{2}{3}$

[1 2 3]
0.6666666666666666

In the above, the denominator was set equal to the number of points. This means that for data with only a single point the variance is 0: the mean of a single value is that value, so the difference and its square are both 0, and 0 divided by 1 is 0. However, in reality if we measure a data point only once, we cannot tell whether it is representative or an outlier.

For this reason the denominator is usually set to the number of points minus 1, known as the number of degrees of freedom. In NumPy this is done with the additional input argument delta degrees of freedom (ddof), set to 1 instead of its default value of 0. For a single data point the numerator is still 0, but the denominator becomes 1-1=0, and 0 divided by 0 is undefined. This is a more realistic description of a sample of size 1 than saying it perfectly matches the mean and has no variance. Of course, for a large number of points the difference between dividing by m and by m-1 becomes small.

Think of the points as nodes on a rope. We hold the first node in our hand; it is our reference point (the height of our hand). All other nodes on the rope are free to vary relative to that height. For a rope with 1 node there is no variation: the only node is at the height of our hand. For a rope with 2 nodes, the first is fixed at the height of our hand but the second can vary, so there is one degree of freedom: the number of nodes, 2, minus 1. Likewise, for a rope with 3 nodes, the node in our hand is fixed and the remaining 2 are free: the number of nodes, 3, minus the 1 fixed point.

$\displaystyle \text{var}=\frac{\sum\limits_{i=1}^{m}{\left( x_{i}-\mu_{x} \right)}^{2}}{m-1}=\frac{\sum\limits_{i=1}^{m}{\left( x_{i}-x_{\text{mean}} \right)}^{2}}{m-1}$

The variance can be calculated with m-1 degrees of freedom using the additional input argument delta degrees of freedom, ddof. Setting it to 1 divides by m-1, whereas the default value of 0 divides by m.

v=np.array([1,2,3])
print(v)
vvar=np.var(v,ddof=1)
print(vvar)

$\displaystyle \frac{{{{{\left( {-1} \right)}}^{2}}+{{{\left( 0 \right)}}^{2}}+{{{\left( 1 \right)}}^{2}}}}{{3-1}}=\frac{2}{2}=1$

[1 2 3]
1.0
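To make the effect of ddof concrete, the sketch below computes both divisors by hand and compares them with np.var, then looks at the single-point case discussed above. The names m, sum_sq, single and single_var are assumptions made for this example, and the warnings filter is only there because NumPy warns when the degrees of freedom are not positive.

```python
import warnings
import numpy as np

v = np.array([1, 2, 3])
m = v.size
sum_sq = np.sum((v - np.mean(v)) ** 2)   # numerator: 2.0

# ddof=0 (default) divides by m, ddof=1 divides by m - 1.
print(sum_sq / m, np.var(v))
print(sum_sq / (m - 1), np.var(v, ddof=1))

# A single point: ddof=0 reports zero variance ...
single = np.array([5.0])
print(np.var(single))   # 0.0

# ... while ddof=1 gives 0 / 0, which NumPy reports as nan.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    single_var = np.var(single, ddof=1)
print(single_var)       # nan
```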

Function var Find the Variance of all Elements in a Matrix or across columns or rows

Let's look at the matrix:

$\displaystyle \text{M}=\left[ {\begin{array}{*{20}{c}} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{array}} \right]$

M=np.reshape(np.arange(start=1,stop=10),[3,3])
print(M)

[[1 2 3]
 [4 5 6]
 [7 8 9]]

Previously we calculated the mean of all its elements, of its columns and of its rows respectively:

Mmeanall=np.mean(M)
print(Mmeanall)
Mmeancols=np.mean(M,axis=0)
print(Mmeancols)
Mmeanrows=np.mean(M,axis=1)
print(Mmeanrows)

5.0

[4. 5. 6.]

[2. 5. 8.]

We can calculate the variance of all elements using:

Mvarall=np.var(M,ddof=1)
print(Mvarall)

$\displaystyle \text{var}(\text{M})=\frac{{{{{\left( {1-5} \right)}}^{2}}}}{9-1}+\frac{{{{{\left( {2-5} \right)}}^{2}}}}{9-1}+\frac{{{{{\left( {3-5} \right)}}^{2}}}}{9-1}+$

$\displaystyle \frac{{{{{\left( {4-5} \right)}}^{2}}}}{{9-1}}+\frac{{{{{\left( {5-5} \right)}}^{2}}}}{{9-1}}+\frac{{{{{\left( {6-5} \right)}}^{2}}}}{{9-1}}+$

$\displaystyle \frac{{{{{\left( {7-5} \right)}}^{2}}}}{{9-1}}+\frac{{{{{\left( {8-5} \right)}}^{2}}}}{{9-1}}+\frac{{{{{\left( {9-5} \right)}}^{2}}}}{{9-1}}$

$\displaystyle \text{var(M)}=\frac{{{{{\left( {-4} \right)}}^{2}}+{{{\left( {-3} \right)}}^{2}}+{{{\left( {-2} \right)}}^{2}}+{{{\left( {-1} \right)}}^{2}}+{{{\left( 0 \right)}}^{2}}+{{{\left( 1 \right)}}^{2}}+{{{\left( 2 \right)}}^{2}}+{{{\left( 3 \right)}}^{2}}+{{{\left( 4 \right)}}^{2}}}}{9-1}$

Because the square of a negative number equals the square of the corresponding positive number, this can be simplified to:

$\displaystyle \text{var(M)}=\frac{{2\left[ {{{{\left( 1 \right)}}^{2}}+{{{\left( 2 \right)}}^{2}}+{{{\left( 3 \right)}}^{2}}+{{{\left( 4 \right)}}^{2}}} \right]}}{9-1}$

$\displaystyle \text{var}\left( \text{M} \right)=\frac{{60}}{8}$

7.5

We can calculate the variance of the columns using:

Mvarcols=np.var(M,axis=0,ddof=1)
print(Mvarcols)

$\displaystyle \text{var}\left( {\text{M},0} \right)=\left[ {\begin{array}{*{20}{c}} {\frac{{{{{\left( {1-4} \right)}}^{2}}}}{3-1}} & {\frac{{{{{\left( {2-5} \right)}}^{2}}}}{3-1}} & {\frac{{{{{\left( {3-6} \right)}}^{2}}}}{3-1}} \\ {\frac{{{{{\left( {4-4} \right)}}^{2}}}}{3-1}} & {\frac{{{{{\left( {5-5} \right)}}^{2}}}}{3-1}} & {\frac{{{{{\left( {6-6} \right)}}^{2}}}}{3-1}} \\ {\frac{{{{{\left( {7-4} \right)}}^{2}}}}{3-1}} & {\frac{{{{{\left( {8-5} \right)}}^{2}}}}{3-1}} & {\frac{{{{{\left( {9-6} \right)}}^{2}}}}{3-1}} \end{array}} \right]$

$\displaystyle \text{var}\left( {\text{M},0} \right)=\left[ {\begin{array}{*{20}{c}} {\frac{{{{{\left( {-3} \right)}}^{2}}}}{3-1}} & {\frac{{{{{\left( {-3} \right)}}^{2}}}}{3-1}} & {\frac{{{{{\left( {-3} \right)}}^{2}}}}{3-1}} \\ 0 & 0 & 0 \\ {\frac{{{{{\left( 3 \right)}}^{2}}}}{3-1}} & {\frac{{{{{\left( 3 \right)}}^{2}}}}{3-1}} & {\frac{{{{{\left( 3 \right)}}^{2}}}}{3-1}} \end{array}} \right]$

$\displaystyle \text{var}\left( {\text{M},0} \right)=\left[ {\begin{array}{*{20}{c}} {\frac{{9+9}}{3-1}} & {\frac{{9+9}}{3-1}} & {\frac{{9+9}}{3-1}} \end{array}} \right]=\left[ {\begin{array}{*{20}{c}} 9 & 9 & 9 \end{array}} \right]$

[9. 9. 9.]

We can calculate the variance of the rows using:

Mvarrows=np.var(M,axis=1,ddof=1)
print(Mvarrows)

$\displaystyle \text{var}\left( {\text{M},1} \right)=\left[ {\begin{array}{*{20}{c}} {\frac{{{{{\left( {1-2} \right)}}^{2}}}}{3-1}} & {\frac{{{{{\left( {2-2} \right)}}^{2}}}}{3-1}} & {\frac{{{{{\left( {3-2} \right)}}^{2}}}}{3-1}} \\ {\frac{{{{{\left( {4-5} \right)}}^{2}}}}{3-1}} & {\frac{{{{{\left( {5-5} \right)}}^{2}}}}{3-1}} & {\frac{{{{{\left( {6-5} \right)}}^{2}}}}{3-1}} \\ {\frac{{{{{\left( {7-8} \right)}}^{2}}}}{3-1}} & {\frac{{{{{\left( {8-8} \right)}}^{2}}}}{3-1}} & {\frac{{{{{\left( {9-8} \right)}}^{2}}}}{3-1}} \end{array}} \right]$

$\displaystyle \text{var}\left( {\text{M},1} \right)=\left[ {\begin{array}{*{20}{c}} {\frac{{{{{\left( {-1} \right)}}^{2}}}}{3-1}} & 0 & {\frac{{{{{\left( 1 \right)}}^{2}}}}{3-1}} \\ {\frac{{{{{\left( {-1} \right)}}^{2}}}}{3-1}} & 0 & {\frac{{{{{\left( 1 \right)}}^{2}}}}{3-1}} \\ {\frac{{{{{\left( {-1} \right)}}^{2}}}}{3-1}} & 0 & {\frac{{{{{\left( 1 \right)}}^{2}}}}{3-1}} \end{array}} \right]$

$\displaystyle \text{var}\left( {\text{M},1} \right)=\left[ {\begin{array}{*{20}{c}} {\frac{2}{3-1}} \\ {\frac{2}{3-1}} \\ {\frac{2}{3-1}} \end{array}} \right]$

[1. 1. 1.]
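The three matrix variances (all elements, columns and rows, each with ddof=1) can be reproduced by hand as a check. This is a minimal sketch; the names var_all, var_cols and var_rows are assumptions made for this example.

```python
import numpy as np

M = np.reshape(np.arange(start=1, stop=10), [3, 3])

# All elements: 9 values, so the divisor is 9 - 1.
var_all = np.sum((M - M.mean()) ** 2) / (M.size - 1)

# Columns (axis=0): subtract each column mean; 3 values per column.
var_cols = np.sum((M - M.mean(axis=0)) ** 2, axis=0) / (M.shape[0] - 1)

# Rows (axis=1): keepdims=True keeps the row means as a 3x1 column so that
# they broadcast across each row.
row_means = M.mean(axis=1, keepdims=True)
var_rows = np.sum((M - row_means) ** 2, axis=1) / (M.shape[1] - 1)

print(var_all, np.var(M, ddof=1))
print(var_cols, np.var(M, axis=0, ddof=1))
print(var_rows, np.var(M, axis=1, ddof=1))
```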

Function std

Function std – Find the Standard Deviation of all Elements in a Vector

Recall that the formula for the variance (ddof=1) is equal to:

$\displaystyle \text{var}=\frac{\sum\limits_{i=1}^{m}{\left( x_{i}-\mu_{x} \right)}^{2}}{m-1}=\frac{\sum\limits_{i=1}^{m}{\left( x_{i}-x_{\text{mean}} \right)}^{2}}{m-1}$

The differences are squared so that positive and negative deviations don't cancel each other out. For instance, the points 1, 2 and 3 have a mean of 2; 1 is 1 unit below the mean and 3 is 1 unit above it. Ignoring the denominator and looking only at the numerator, summing -1, 0 and 1 would give 0, suggesting no variance. This is incorrect, because 1 and 3 each differ from the mean by 1 unit, just in opposite directions. Squaring each difference gives a metric that is zero only when every data point equals the mean.

As a consequence of the squaring we have lost the direction of each deviation and, more importantly, the original units. The variance is in squared units, so it cannot be compared directly with the mean, median or mode. For this reason the square root of the variance, which has the same units as the mean, median and mode, is commonly quoted alongside them. This is called the standard deviation. In NumPy the standard deviation function std takes the same input arguments as var. The standard deviation with a delta degrees of freedom of 1 has the form:

$\displaystyle \text{std}=\sqrt{\frac{\sum\limits_{i=1}^{m}{\left( x_{i}-\mu_{x} \right)}^{2}}{m-1}}=\sqrt{\frac{\sum\limits_{i=1}^{m}{\left( x_{i}-x_{\text{mean}} \right)}^{2}}{m-1}}$

v=np.array([1,3,5])
print(v)
vmean=np.mean(v)
print(vmean)
vvar=np.var(v,ddof=1)
print(vvar)
vstd=np.std(v,ddof=1)
print(vstd)

$\displaystyle \text{vmean}=\frac{{1+3+5}}{3}=3$

$\displaystyle \text{vvar}=\frac{{{{{\left( {1-3} \right)}}^{2}}+{{{\left( {3-3} \right)}}^{2}}+{{{\left( {5-3} \right)}}^{2}}}}{{3-1}}=\frac{{{{{\left( {-2} \right)}}^{2}}+{{{\left( 0 \right)}}^{2}}+{{{\left( 2 \right)}}^{2}}}}{2}=\frac{8}{2}=4$

$\displaystyle \text{vstd}=\sqrt{{\frac{{{{{\left( {1-3} \right)}}^{2}}+{{{\left( {3-3} \right)}}^{2}}+{{{\left( {5-3} \right)}}^{2}}}}{{3-1}}}}=\sqrt{{\frac{{{{{\left( {-2} \right)}}^{2}}+{{{\left( 0 \right)}}^{2}}+{{{\left( 2 \right)}}^{2}}}}{2}}}=\sqrt{{\frac{8}{2}}}=\sqrt{4}=2$

[1 3 5]

3.0

4.0

2.0
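The point about units can be illustrated with a short sketch. Here the same data is rescaled by a factor of 10 (as if converting between units); the standard deviation scales by 10 like the data itself, while the variance scales by 100. The names v10 and vstd_manual are assumptions made for this example.

```python
import numpy as np

v = np.array([1, 3, 5])

# The standard deviation is the square root of the variance.
vstd_manual = np.sqrt(np.var(v, ddof=1))
print(vstd_manual, np.std(v, ddof=1))              # 2.0 2.0

# Rescale the data by 10, as if changing units.
v10 = 10 * v
print(np.var(v10, ddof=1), np.std(v10, ddof=1))    # 400.0 20.0
```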

Function std – Find the Standard Deviation of all Elements in a Matrix or across columns or rows

Mstdall=np.std(M,ddof=1)
print(Mstdall)
Mstdcols=np.std(M,axis=0,ddof=1)
print(Mstdcols)
Mstdrows=np.std(M,axis=1,ddof=1)
print(Mstdrows)

2.7386127875258306

[3. 3. 3.]

[1. 1. 1.]

Function cumsum

Function cumsum – Find the Cumulative Sum of all Elements in a Vector

The cumulative sum of a vector takes the form:

$\displaystyle \text{V=}\left[ {\begin{array}{*{20}{c}} \text{A} \\ \text{B} \\ \text{C} \end{array}} \right]$

$\displaystyle \text{cumsum}\left( \text{V} \right)=\left[ {\begin{array}{*{20}{c}} \text{A} \\ {\text{A+B}} \\ {\text{A+B+C}} \end{array}} \right]$

v=np.arange(start=1,stop=4)
print(v)
vcumsum=np.cumsum(v)
print(vcumsum)

[1 2 3]

[1 3 6]
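The cumulative sum is simply a running total. As a minimal sketch, the loop below builds the same result element by element; the names running_total and manual are assumptions made for this example.

```python
import numpy as np

v = np.arange(start=1, stop=4)

# Keep a running total and record it after each element is added.
running_total = 0
manual = []
for value in v:
    running_total += value
    manual.append(running_total)
manual = np.array(manual)

print(manual)          # [1 3 6]
print(np.cumsum(v))    # [1 3 6]

# The last element of the cumulative sum equals the ordinary sum.
print(np.cumsum(v)[-1] == np.sum(v))   # True
```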

Function cumsum – Find the Cumulative Sum of all Elements in a Matrix or across columns or rows

M=np.reshape(np.arange(start=1,stop=10),[3,3])
print(M)
Mflattened=np.ndarray.flatten(M)
print(Mflattened)
Mcumsum=np.cumsum(M)
print(Mcumsum)

$\displaystyle \text{Mflattened}=\left[ {\begin{array}{*{20}{c}} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \end{array}} \right]$

$\displaystyle \text{cumsum}\left( \text{M} \right)=\left[ {\begin{array}{*{20}{c}} 1 & \begin{array}{l}1\\+\\2\end{array} & \begin{array}{l}1\\+\\2\\+\\3\end{array} & \begin{array}{l}1\\+\\2\\+\\3\\+\\4\end{array} & \begin{array}{l}1\\+\\2\\+\\3\\+\\4\\+\\5\end{array} & \begin{array}{l}1\\+\\2\\+\\3\\+\\4\\+\\5\\+\\6\end{array} & \begin{array}{l}1\\+\\2\\+\\3\\+\\4\\+\\5\\+\\6\\+\\7\end{array} & \begin{array}{l}1\\+\\2\\+\\3\\+\\4\\+\\5\\+\\6\\+\\7\\+\\8\end{array} & \begin{array}{l}1\\+\\2\\+\\3\\+\\4\\+\\5\\+\\6\\+\\7\\+\\8\\+\\9\end{array} \end{array}} \right]$

$\displaystyle \text{cumsum}\left( \text{M} \right)=\left[ {\begin{array}{*{20}{c}} 1 & 3 & 6 & {10} & {15} & {21} & {28} & {36} & {45} \end{array}} \right]$

[[1 2 3]
 [4 5 6]
 [7 8 9]]

[1 2 3 4 5 6 7 8 9]

[ 1  3  6 10 15 21 28 36 45]

print(M)
Mcumsumcols=np.cumsum(M,axis=0)
print(Mcumsumcols)

$\displaystyle \text{M}=\left[ {\begin{array}{*{20}{c}} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{array}} \right]$

$\displaystyle \text{cumsum}\left( {\text{M,0}} \right)=\left[ {\begin{array}{*{20}{c}} 1 & 2 & 3 \\ {1+4} & {2+5} & {3+6} \\ {1+4+7} & {2+5+8} & {3+6+9} \end{array}} \right]$

$\displaystyle \text{cumsum}\left( {\text{M,0}} \right)=\left[ {\begin{array}{*{20}{c}} 1 & 2 & 3 \\ 5 & 7 & 9 \\ {12} & {15} & {18} \end{array}} \right]$

[[1 2 3]
 [4 5 6]
 [7 8 9]]

[[ 1  2  3]
 [ 5  7  9]
 [12 15 18]]

print(M)
Mcumsumrows=np.cumsum(M,axis=1)
print(Mcumsumrows)

$\displaystyle \text{M}=\left[ {\begin{array}{*{20}{c}} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{array}} \right]$

$\displaystyle \text{cumsum}\left( {\text{M,1}} \right)=\left[ {\begin{array}{*{20}{c}} 1 & {1+2} & {1+2+3} \\ 4 & {4+5} & {4+5+6} \\ 7 & {7+8} & {7+8+9} \end{array}} \right]$

$\displaystyle \text{cumsum}\left( {\text{M,1}} \right)=\left[ {\begin{array}{*{20}{c}} 1 & 3 & 6 \\ 4 & 9 & {15} \\ 7 & {15} & {24} \end{array}} \right]$

[[1 2 3]
 [4 5 6]
 [7 8 9]]

[[ 1  3  6]
 [ 4  9 15]
 [ 7 15 24]]
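A quick way to check the axis behaviour is to compare the final entries of each cumulative sum with np.sum along the same axis. This is only a sketch; the variable names follow those used above.

```python
import numpy as np

M = np.reshape(np.arange(start=1, stop=10), [3, 3])

# Without an axis, cumsum works on the flattened matrix.
Mcumsum = np.cumsum(M)
print(Mcumsum[-1], np.sum(M))                  # 45 45

# axis=0 accumulates down each column: the last row holds the column sums.
Mcumsumcols = np.cumsum(M, axis=0)
print(Mcumsumcols[-1, :], np.sum(M, axis=0))   # 12, 15, 18 in both

# axis=1 accumulates along each row: the last column holds the row sums.
Mcumsumrows = np.cumsum(M, axis=1)
print(Mcumsumrows[:, -1], np.sum(M, axis=1))   # 6, 15, 24 in both
```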

Function cumprod

Function cumprod – Find the Cumulative Product of all Elements in a Vector

The cumulative product of a vector takes the form:

$\displaystyle \text{V=}\left[ {\begin{array}{*{20}{c}} \text{A} \\ \text{B} \\ \text{C} \end{array}} \right]$

$\displaystyle \text{cumprod}(\text{V})=\left[ {\begin{array}{*{20}{c}} \text{A} \\ {\text{A}\times \text{B}} \\ {\text{A}\times \text{B}\times \text{C}} \end{array}} \right]$

v=np.arange(start=1,stop=4)
print(v)
vcumprod=np.cumprod(v)
print(vcumprod)

[1 2 3]

[1 2 6]
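One familiar instance of the cumulative product: applied to the integers 1, 2, 3, ... it produces the factorials. The sketch below compares np.cumprod against math.factorial from the standard library; the names n, factorials and expected are assumptions made for this example.

```python
import math
import numpy as np

n = np.arange(start=1, stop=10)

# Cumulative product of 1..9 gives 1!, 2!, ..., 9!
factorials = np.cumprod(n)
expected = [math.factorial(int(k)) for k in n]

print(factorials)
print(expected)
```

NumPy integers have a fixed width, so for much longer vectors the cumulative product can overflow without raising an error; converting to float or using Python integers avoids this.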

Function cumprod – Find the Cumulative Product of all Elements in a Matrix or across columns or rows

M=np.reshape(np.arange(start=1,stop=10),[3,3])
print(M)
Mflattened=np.ndarray.flatten(M)
print(Mflattened)
Mcumprod=np.cumprod(M)
print(Mcumprod)

$\displaystyle \text{Mflattened}=\left[ {\begin{array}{*{20}{c}} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \end{array}} \right]$

$\displaystyle \text{cumprod}\left( \text{M} \right)=\left[ {\begin{array}{*{20}{c}} 1 & \begin{array}{l}1\\\times \\2\end{array} & \begin{array}{l}1\\\times \\2\\\times \\3\end{array} & \begin{array}{l}1\\\times \\2\\\times \\3\\\times \\4\end{array} & \begin{array}{l}1\\\times \\2\\\times \\3\\\times \\4\\\times \\5\end{array} & \begin{array}{l}1\\\times \\2\\\times \\3\\\times \\4\\\times \\5\\\times \\6\end{array} & \begin{array}{l}1\\\times \\2\\\times \\3\\\times \\4\\\times \\5\\\times \\6\\\times \\7\end{array} & \begin{array}{l}1\\\times \\2\\\times \\3\\\times \\4\\\times \\5\\\times \\6\\\times \\7\\\times \\8\end{array} & \begin{array}{l}1\\\times \\2\\\times \\3\\\times \\4\\\times \\5\\\times \\6\\\times \\7\\\times \\8\\\times \\9\end{array} \end{array}} \right]$

$\displaystyle \text{cumprod}\left( \text{M} \right)=\left[ {\begin{array}{*{20}{c}} 1 & 2 & 6 & {24} & {120} & {720} & {5040} & {40320} & {362880} \end{array}} \right]$

[[1 2 3]
 [4 5 6]
 [7 8 9]]

[1 2 3 4 5 6 7 8 9]

[     1      2      6     24    120    720   5040  40320 362880]

print(M)
Mcumprodcols=np.cumprod(M,axis=0)
print(Mcumprodcols)

$\displaystyle \text{M}=\left[ {\begin{array}{*{20}{c}} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{array}} \right]$

$\displaystyle \text{cumprod}\left( {\text{M,0}} \right)=\left[ {\begin{array}{*{20}{c}} 1 & 2 & 3 \\ {1\times 4} & {2\times 5} & {3\times 6} \\ {1\times 4\times 7} & {2\times 5\times 8} & {3\times 6\times 9} \end{array}} \right]$

$\displaystyle \text{cumprod}\left( {\text{M,0}} \right)=\left[ {\begin{array}{*{20}{c}} 1 & 2 & 3 \\ 4 & {10} & {18} \\ {28} & {80} & {162} \end{array}} \right]$

[[1 2 3]
 [4 5 6]
 [7 8 9]]

[[  1   2   3]
 [  4  10  18]
 [ 28  80 162]]

print(M)
Mcumprodrows=np.cumprod(M,axis=1)
print(Mcumprodrows)

$\displaystyle \text{M}=\left[ {\begin{array}{*{20}{c}} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{array}} \right]$

$\displaystyle \text{cumprod}\left( {\text{M,1}} \right)=\left[ {\begin{array}{*{20}{c}} 1 & {1\times 2} & {1\times 2\times 3} \\ 4 & {4\times 5} & {4\times 5\times 6} \\ 7 & {7\times 8} & {7\times 8\times 9} \end{array}} \right]$

$\displaystyle \text{cumprod}\left( {\text{M,1}} \right)=\left[ {\begin{array}{*{20}{c}} 1 & 2 & 6 \\ 4 & {20} & {120} \\ 7 & {56} & {504} \end{array}} \right]$

[[1 2 3]
 [4 5 6]
 [7 8 9]]

[[  1   2   6]
 [  4  20 120]
 [  7  56 504]]
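As a cross-check, the same results can be obtained with the accumulate method of the np.multiply ufunc, which is the operation cumprod performs along an axis. This sketch also confirms that the final row or column matches np.prod. Note that accumulate uses axis=0 by default, whereas cumprod without an axis works on the flattened matrix.

```python
import numpy as np

M = np.reshape(np.arange(start=1, stop=10), [3, 3])

# Repeated multiplication down the columns and along the rows.
acc_cols = np.multiply.accumulate(M, axis=0)
acc_rows = np.multiply.accumulate(M, axis=1)

print(np.array_equal(acc_cols, np.cumprod(M, axis=0)))   # True
print(np.array_equal(acc_rows, np.cumprod(M, axis=1)))   # True

# The last row / last column hold the ordinary products.
print(acc_cols[-1, :], np.prod(M, axis=0))   # 28, 80, 162 in both
print(acc_rows[:, -1], np.prod(M, axis=1))   # 6, 120, 504 in both
```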

Function diff

Function diff – Find the Difference of all Elements in a Vector

The diff of a vector takes the following form. Note that each element of the new vector is the difference between two neighbouring points, so the result has one point fewer than we started with:

$\displaystyle \text{V=}\left[ {\begin{array}{*{20}{c}} \text{A} \\ \text{B} \\ \text{C} \end{array}} \right]$

$\displaystyle \text{diff}\left( \text{V} \right)=\left[ {\begin{array}{*{20}{c}} {\text{B}-\text{A}} \\ {\text{C}-\text{B}} \end{array}} \right]$

v=np.arange(start=1,stop=4)
print(v)
vdiff=np.diff(v)
print(vdiff)

[1 2 3]

[1 1]
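The difference can be written out with slicing, and it is undone by a cumulative sum once the first value is supplied again. The following is a minimal sketch with a less regular vector; the names w, manual_diff and rebuilt are assumptions made for this example.

```python
import numpy as np

w = np.array([2, 5, 4, 10])

# Each element minus its left-hand neighbour: one element fewer than w.
manual_diff = w[1:] - w[:-1]
print(manual_diff)    # [ 3 -1  6]
print(np.diff(w))     # [ 3 -1  6]

# diff discards the starting value; adding it back to the running total of
# the differences recovers the original vector.
rebuilt = np.concatenate(([w[0]], w[0] + np.cumsum(np.diff(w))))
print(rebuilt)        # [ 2  5  4 10]
```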

Function diff – Find the Difference of all Elements in a Matrix or across columns or rows

M=np.reshape(np.arange(start=1,stop=10),[3,3])
print(M)
Mflattened=np.ndarray.flatten(M)
print(Mflattened)
Mdiff=np.diff(Mflattened)
print(Mdiff)

$\displaystyle \text{Mflattened}=\left[ {\begin{array}{*{20}{c}} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \end{array}} \right]$

$\displaystyle \text{diff}\left( {\text{Mflattened}} \right)=\left[ {\begin{array}{*{20}{c}} {2-1} & {3-2} & {4-3} & {5-4} & {6-5} & {7-6} & {8-7} & {9-8} \end{array}} \right]$

$\displaystyle \text{diff}\left( {\text{Mflattened}} \right)=\left[ {\begin{array}{*{20}{c}} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \end{array}} \right]$

[1 2 3 4 5 6 7 8 9]

[1 1 1 1 1 1 1 1]

print(M)
Mdiffcols=np.diff(M,axis=0)
print(Mdiffcols)

$\displaystyle \text{M}=\left[ {\begin{array}{*{20}{c}} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{array}} \right]$

$\displaystyle \text{diff}\left( {\text{M,0}} \right)\text{=}\left[ {\begin{array}{*{20}{c}} {4-1} & {5-2} & {6-3} \\ {7-4} & {8-5} & {9-6} \end{array}} \right]$

[[1 2 3]
 [4 5 6]
 [7 8 9]]

[[3 3 3]
 [3 3 3]]

print(M)
Mdiffrows=np.diff(M,axis=1)
print(Mdiffrows)

$\displaystyle \text{M}=\left[ {\begin{array}{*{20}{c}} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{array}} \right]$

$\displaystyle \text{diff}\left( {\text{M,1}} \right)\text{=}\left[ {\begin{array}{*{20}{c}} {2-1} & {3-2} \\ {5-4} & {6-5} \\ {8-7} & {9-8} \end{array}} \right]$

[[1 2 3]
 [4 5 6]
 [7 8 9]]

[[1 1]
 [1 1]
 [1 1]]
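The axis results above can be reproduced with slicing, which also makes the change in shape explicit. This is only a sketch; manual_cols and manual_rows are names introduced for this example. Unlike cumsum and cumprod, np.diff does not flatten the matrix when no axis is given; it works along the last axis, which is why the matrix was flattened explicitly above.

```python
import numpy as np

M = np.reshape(np.arange(start=1, stop=10), [3, 3])

# axis=0: each row minus the row above it, giving one row fewer.
manual_cols = M[1:, :] - M[:-1, :]
print(manual_cols.shape)   # (2, 3)

# axis=1: each column minus the column to its left, giving one column fewer.
manual_rows = M[:, 1:] - M[:, :-1]
print(manual_rows.shape)   # (3, 2)

# With no axis argument, diff uses the last axis (here axis=1).
print(np.array_equal(np.diff(M), manual_rows))   # True
```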
