Python and Matplotlib: Introduction to Image Processing

Example Picture

In this example, I will use the following picture. If you are working through this guide you can use the same picture, or you can replace it with one of your own. Save the picture by right-clicking it, selecting Save as… and calling it LondonPNG.png. The picture should be in the same folder as your Python script.


We require the following libraries


We will also close all existing plots:


Loading a Picture as float32 array using the function imread

The function imread found in the matplotlib.image library can be used to read in a picture from a file.
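A minimal sketch of the loading step follows. So that it runs without LondonPNG.png being present, it first writes a small synthetic RGBA picture to a temporary file and then reads it back with imread. The array demo and the file name demo.png are stand-ins and not part of the original guide; with the real picture you would call mpimg.imread("LondonPNG.png") directly.

```python
import os
import tempfile

import numpy as np
import matplotlib.image as mpimg

# Stand-in picture (assumption): 6 rows x 8 columns x 4 pages (RGBA).
rng = np.random.default_rng(0)
demo = rng.random((6, 8, 4)).astype(np.float32)
demo[:, :, 3] = 1.0  # fully opaque alpha channel

# Write it out as a PNG, then load it the same way as LondonPNG.png.
path = os.path.join(tempfile.mkdtemp(), "demo.png")
mpimg.imsave(path, demo)
img1 = mpimg.imread(path)  # with the real file: mpimg.imread("LondonPNG.png")

print(img1.dtype, img1.shape)
```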


A Picture as a 3D Array

The picture, now stored as a float32 3D array, can be found in the variable explorer.

Opening it up, we can see the data as viewed via axis0.

Note that at the bottom we are viewing the array using axis 0 and that the shape is 720×960×4. This is rows×columns×pages with the rows selected; in other words, we are looking only at row 0. In this view (axis=0) the data shown is for each of the 960 columns of row 0. Each column is listed in this view as a row which has 4 values (the four pages). The four pages correspond to the colour channels. Recall that all colours are made up of the three primary colours red, green and blue; these are the 0th, 1st and 2nd pages (or the 0th, 1st and 2nd columns in the axis=0 view). The fourth page (page 3, recalling that we use zero-based indexing) is the alpha channel, representing the transparency (in this case the image is not transparent, so every value in the fourth page is 1).

Let's look at the first row that we see in the view where axis=0. This set of four values corresponds to the colour of the 0th row and 0th column, otherwise known as pixel 0,0. We can index this value using:
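The indexing can be sketched as below. Because the picture file is not available here, img1 is a stand-in (an assumption): a uniformly sky-blue array with the same shape as the picture. With the real picture, img1 would be the array returned by imread.

```python
import numpy as np

# Stand-in for the loaded picture (assumption): a sky-blue 720x960 RGBA array.
sky = np.array([61, 119, 201, 255], dtype=np.float32) / 255
img1 = np.tile(sky, (720, 960, 1))

# Row 0, column 0, all four pages (red, green, blue, alpha).
pixel = img1[0, 0, :]
print(pixel)
```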

[0.23921569 0.46666667 0.7882353  1.        ]

This is the 0th row, as viewed from axis0.

Recall that imread returns the colours of a PNG as floats normalised between 0 and 1, whereas Microsoft Office displays them as values from 0 to 255. We can get the values Microsoft Office would specify by using:
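A short sketch of the conversion, using the pixel values quoted above. Rounding to the nearest whole number is an assumption made here to absorb float32 error; the original code may have scaled without rounding.

```python
import numpy as np

# Pixel 0,0 from the text, as float32 values between 0 and 1.
pixel = np.array([0.23921569, 0.46666667, 0.7882353, 1.0], dtype=np.float32)

# Scale the three colour pages to the 0-255 range and round to whole numbers.
r, g, b = np.round(pixel[:3] * 255)

print("r=", r)
print("g=", g)
print("b=", b)
```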

r= 61.0
g= 119.0
b= 201.0

We can take these into the colour picker in Microsoft Office and we get the colour of the sky (as expected with this image).

Resolution of the Image

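We can use the function np.shape to look at the dimensions of the array. To get the number of pixels we multiply the number of rows by the number of columns. Here we divide by 1024 to get kilopixels and by 1024 again to get megapixels, following the binary convention used in computing; note that camera resolutions are usually quoted in decimal megapixels (1 megapixel = 1,000,000 pixels).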
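The calculation can be sketched as follows, with a blank array of the same shape standing in for the picture (an assumption, so that the example runs without the file).

```python
import numpy as np

# Stand-in array with the same shape as the picture in the text (assumption).
img1 = np.zeros((720, 960, 4), dtype=np.float32)

rows, cols, pages = np.shape(img1)
resolution = rows * cols / 1024 / 1024  # pixels -> kilopixels -> megapixels

print("rows=", rows)
print("cols=", cols)
print("pages=", pages)
print("resolution=", round(resolution, 2), "Mpixels")
```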


rows= 720
cols= 960
pages= 4
resolution= 0.66 Mpixels

Viewing the 3D Array by Other Axes

In the variable explorer we can change the axes to view the array from. Switching from axis 0 to axis 1 will set the view to put the rows as the rows and the colours as the columns:

And to axis 2 will set the rows to display as the rows and the columns to display as the columns. In other words it will show page 0 as the numeric float of the red channel:

Page 1 as the numeric float of the green channel:

And page 2 as the numeric float of the blue channel.

In such a small region of the picture (14 rows and only a few columns) there is very little change. Here is the image; recall that the variable editor is displaying just 14 of the 720 rows and 4 of the 960 columns, so everything shown lies within a tiny patch of sky blue in the top left corner of the image.

Viewing the Picture


We can select a point on the picture using the mouse cursor. Let us go to the White Ensign and select different colours.

Let's once again index and select the colours from:

row=185, col=571

[0.5176471  0.17254902 0.21960784 1.        ]
r= 132.0
g= 44.0
b= 56.0

row=203, col=605

[0.47843137 0.5019608  0.5647059  1.        ]
r= 122.0
g= 128.0
b= 144.0

row=167, col=617

[0.         0.07058824 0.23921569 1.        ]
r= 0.0
g= 18.0
b= 61.0

row=304, col=102

Turning Axes On or Off


Plotting a Grid

A grid can be added to the figure, just as it would be for any ordinary figure.
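The viewing steps of the last few sections might look like the sketch below. The array img1 is a small synthetic stand-in (an assumption), and the use of subplots with an explicit ax object is one possible style rather than necessarily the one used in the original code.

```python
import numpy as np
import matplotlib.pyplot as plt

# Stand-in picture (assumption): a small, uniformly blue RGBA array.
img1 = np.zeros((72, 96, 4), dtype=np.float32)
img1[:, :, 2] = 0.8
img1[:, :, 3] = 1.0

fig, ax = plt.subplots()
ax.imshow(img1)    # draw the 3D array as a picture
ax.grid(True)      # overlay a grid on the pixel coordinates
# ax.axis("off")   # uncomment to hide the axes (the grid disappears with them)
# plt.show()       # uncomment when running as a script
```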


Compressing and Saving

Recall that the image consists of 720 rows×960 columns×4 pages. To reduce the file size we can take every nth row and every nth column. For example, to reduce the number of pixels by a factor of 4 we take every 2nd row and every 2nd column.
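Taking every 2nd row and every 2nd column is a single slicing operation. The sketch below uses a blank stand-in array of the same shape as the picture (an assumption).

```python
import numpy as np

# Stand-in array with the same shape as the picture (assumption).
img1 = np.zeros((720, 960, 4), dtype=np.float32)

# Keep every 2nd row and every 2nd column, and all four pages.
img2 = img1[::2, ::2, :]

rows2, cols2, pages2 = img2.shape
print("img2 resolution=", round(rows2 * cols2 / 1024 / 1024, 2), "Mpixels")
```

The reduced array could then be written to disk with mpimg.imsave, for example under a hypothetical name such as London2.png.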


Let's compare this with the earlier figure 3…

img1 resolution= 0.66 Mpixels

As can be seen, the number of pixels along both the x and y axes has halved.

img2 resolution= 0.16 Mpixels

The file size in Windows Explorer is seen to be about a quarter of the original (as expected with a compression ratio of 4):

Note the difference between line 15 and line 21: line 15 saves the compressed 3D array to a new png file, whereas line 21 saves figure 4, which contains the image, as a file.

We can repeat this, compressing 16-fold, 64-fold, 256-fold and 1024-fold. For convenience this will be done using a for loop:
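One way to write such a loop is sketched below, doubling the slicing step each time so that the pixel count falls roughly 4-fold per pass. The names img3 to img8 follow the printed results in this section; the stand-in array and the dictionary resolutions are assumptions made for the example.

```python
import numpy as np

# Stand-in array with the same shape as the picture (assumption).
img1 = np.zeros((720, 960, 4), dtype=np.float32)

resolutions = {}
for k in range(6):
    step = 2 ** k                      # 1, 2, 4, 8, 16, 32
    small = img1[::step, ::step, :]    # keep every step-th row and column
    rows, cols, _ = small.shape
    name = f"img{k + 3}"
    resolutions[name] = rows * cols / 1024 / 1024
    print(f"{name}_resolution={round(resolutions[name], 6)} Mpixels")
```

Note that 720 is not divisible by 32, so the last pass keeps 23 rows rather than 22.5, which is why the final reduction is only approximately 1024-fold.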

img3_resolution=0.65918 Mpixels
img4_resolution=0.164795 Mpixels
img5_resolution=0.041199 Mpixels
img6_resolution=0.0103 Mpixels
img7_resolution=0.002575 Mpixels
img8_resolution=0.000658 Mpixels

Here you can see the compression ruin the quality of the picture. The file size in Windows Explorer:

Rotating an Image

We can use a rotate function, such as scipy.ndimage.rotate (matplotlib.image does not itself provide one), to rotate the image by, for example, 45 degrees (line 13).


The rotated image has a large proportion of white space. It is also possible to crop the image by selecting a sub-selection, for instance pixels 400-800, 400-800:
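Cropping is again just slicing. In the sketch below, rotated is a blank stand-in for the rotated picture, and its 1200×1200 size is an assumption chosen only so that the 400-800 range fits inside it.

```python
import numpy as np

# Stand-in for the rotated picture (assumption): a blank 1200x1200 RGBA array.
rotated = np.ones((1200, 1200, 4), dtype=np.float32)

# Rows 400-799 and columns 400-799, keeping all four pages.
cropped = rotated[400:800, 400:800, :]
print(cropped.shape)
```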


Splitting Image into Primary RGBA Channels

Now let's look at img1 and get its dimensions (line 12). Let's select each individual page: page 0 holds the red floats per pixel, page 1 the green floats, page 2 the blue floats and page 3 the alpha (transparency) floats. One can index into img1 by selecting all rows, all columns and page number 0, 1, 2 or 3, assigning the results to the individual variables r, g, b and alpha respectively (lines 14-17).

The function zeros can be used with the array dimensions (line 12) to create an empty array (line 19). Page 3 of this empty array can be assigned the alpha values (line 20). The empty array can be copied to make a new array r2 (line 22), which can be modified by adding only the red values to page 0 (line 23). This r2 now contains only the red values (page 0) and the alpha values (page 3); the green values (page 1) and blue values (page 2) are left as zeros. The procedure can be repeated for the green and blue channels. These can be plotted as subplots alongside img1.
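The procedure described above can be sketched as follows. The variable names r, g, b, alpha and r2 come from the text; dims, empty, g2 and b2 are names assumed here, and img1 is a small random stand-in for the picture.

```python
import numpy as np

# Stand-in picture (assumption): small random RGBA array, fully opaque.
rng = np.random.default_rng(0)
img1 = rng.random((4, 6, 4)).astype(np.float32)
img1[:, :, 3] = 1.0

dims = np.shape(img1)

# All rows, all columns, one page at a time.
r = img1[:, :, 0]
g = img1[:, :, 1]
b = img1[:, :, 2]
alpha = img1[:, :, 3]

# Array of zeros with the same dimensions, carrying only the alpha page.
empty = np.zeros(dims, dtype=img1.dtype)
empty[:, :, 3] = alpha

# Copy it once per channel and fill in only that channel's page.
r2 = empty.copy()
r2[:, :, 0] = r
g2 = empty.copy()
g2[:, :, 1] = g
b2 = empty.copy()
b2[:, :, 2] = b
```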


Looking at Secondary Channels

Now that we have looked at the image in terms of its primary colours we can also look at it in terms of its secondary colours. Recall the secondary colours are made up of two primary colours:

Secondary Colour    Primary Colour 1    Primary Colour 2
Yellow              Red                 Green
Cyan                Green               Blue
Magenta             Red                 Blue

We can then plot the primary colours alongside the secondary colours using gridspec to orientate the subplots with the secondary colours around those of the primary colours.
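Before plotting, the secondary-colour arrays have to be built. A minimal sketch is given below; the helper keep_pages is hypothetical (introduced here for illustration) and img1 is a small random stand-in for the picture.

```python
import numpy as np


def keep_pages(img, pages):
    """Return a copy of img with every colour page zeroed except those listed."""
    out = np.zeros_like(img)
    out[:, :, 3] = img[:, :, 3]      # always keep the alpha page
    for p in pages:
        out[:, :, p] = img[:, :, p]
    return out


# Stand-in picture (assumption): small random RGBA array, fully opaque.
rng = np.random.default_rng(0)
img1 = rng.random((4, 6, 4)).astype(np.float32)
img1[:, :, 3] = 1.0

yellow = keep_pages(img1, [0, 1])    # red + green
cyan = keep_pages(img1, [1, 2])      # green + blue
magenta = keep_pages(img1, [0, 2])   # red + blue
```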


Enhancing or Reducing One of the Channels

Now that we understand the principles behind an image file, i.e. a 3D array which has rows, columns and pages, where the 0th page is a matrix of numeric floats for the red channel, the 1st page for the green channel, the 2nd page for the blue channel and the 3rd page for the alpha channel, and knowing that these floats are normalised between 0 and 1, we can numerically perform some basic image editing.

Let us create a 2x red filter where we double the intensity of the red channel (lines 31-33) and then set any floats greater than the maximum value of 1 to equal 1 (line 35). We can then create a new image (lines 37-39) and plot these as subplots.
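A sketch of the 2x red filter is shown below. The names red2 and img_red2 are assumptions, and img1 is a small random stand-in for the picture.

```python
import numpy as np

# Stand-in picture (assumption): small random RGBA array, fully opaque.
rng = np.random.default_rng(0)
img1 = rng.random((4, 6, 4)).astype(np.float32)
img1[:, :, 3] = 1.0

# Double the red page, then cap anything above the maximum value of 1.
red2 = img1[:, :, 0] * 2
red2[red2 > 1] = 1

# New image: boosted red page, with green, blue and alpha unchanged.
img_red2 = img1.copy()
img_red2[:, :, 0] = red2
```

Changing the factor 2 to 4, 8, 0.5 or 0.1 gives the other filters; for factors below 1 the capping step has no effect.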


Now let's rewrite this as a for loop and look at the influence of a 2x, 4x, 8x, 0.5x and 0.1x filter.


As we can see, with the 2x red filter the red channel is enhanced and the picture has a red tinge.

With the 4x red filter, we see that many of the pixels in the red channel get saturated and the picture has a stronger red tinge.

With the 8x red filter, the red channel becomes hard to resolve as most of it is saturated; once again the red tinge is stronger.

With the 0.5x red filter, we see the intensity in the red channel is a lot lower. This gives a cyan tinge as blue and green combine to make cyan.

With a 0.1x red filter, we see the intensity in the red channel is very low and is very hard to resolve above the background. The image is dominated by the blue and green channels and has a stronger cyan tinge.

As practice you can experiment with the other two channels and perhaps create a different custom filter for the three channels.

Converting to Greyscale

The data shown so far has been colour data, which has four channels (RGBA). Greyscale data, on the other hand, has only a single channel. The data can be collapsed into a single channel using the average value of the three colour channels:
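The averaging can be sketched as below, with img1 a small random stand-in for the picture (an assumption). The alpha page is left out of the average.

```python
import numpy as np

# Stand-in picture (assumption): small random RGBA array, fully opaque.
rng = np.random.default_rng(0)
img1 = rng.random((4, 6, 4)).astype(np.float32)
img1[:, :, 3] = 1.0

r = img1[:, :, 0]
g = img1[:, :, 1]
b = img1[:, :, 2]

# Equal weighting of 1/3 for each colour channel.
greyscale = (r + g + b) / 3
print(greyscale.shape)
```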


The variable greyscale created is a matrix of 720 rows by 960 columns. When plotted however a colourmap is applied by default giving it 'false colour'.

It was calculated using a ratio of 1/3 for each channel. However, to compensate for our eyes being more sensitive to green than to red or blue, a compensation factor may be applied to each channel (line 19).
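The compensation factors used in the original code are not shown here. As an illustration only, the sketch below assumes the widely used ITU-R BT.601 luma weights (0.299 for red, 0.587 for green, 0.114 for blue) and applies them to a tiny test picture containing one pure red, one pure green, one pure blue and one white pixel.

```python
import numpy as np

# Tiny test picture (assumption): red, green, blue and white pixels.
img1 = np.zeros((2, 2, 4), dtype=np.float32)
img1[:, :, 3] = 1.0
img1[0, 0, :3] = [1, 0, 0]
img1[0, 1, :3] = [0, 1, 0]
img1[1, 0, :3] = [0, 0, 1]
img1[1, 1, :3] = [1, 1, 1]

# Assumed weights (ITU-R BT.601), not necessarily those of the original code.
weights = np.array([0.299, 0.587, 0.114], dtype=np.float32)

# Weighted sum over the three colour pages for every pixel.
greyscale = img1[:, :, :3] @ weights
print(greyscale)
```

Because the weights sum to 1, a white pixel still maps to a value of about 1, while pure green comes out brighter than pure red or pure blue.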


The default colourmap is viridis. We can add a colorbar to see how the colours correspond to the value of each pixel.


Colourmaps for Greyscale Data

We can change this to other colourmaps, for example bone, jet and hot, which are commonly used with greyscale data.


Enhancing Greyscale Data with Colourmap

Once again we can attempt to enhance the brightness using a multiplication factor and cut off any data which exceeds 1.


Adding Noise

This image was taken in daylight and as a consequence a large number of photons were available leading to a very good signal to noise ratio. In many scientific applications for instance microscopy, the signal to noise ratio may be poorer and we may mimic this case by applying a level of random noise to our image.
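One simple way to mimic a poorer signal to noise ratio is sketched below: the signal is scaled down and uniform random noise between 0 and 1 is added. Both the noise model and the 1/4 scaling are assumptions made for illustration, and greyscale is a synthetic stand-in for the greyscale picture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in greyscale picture (assumption): a smooth ramp from 0 to 1.
greyscale = np.linspace(0, 1, 72 * 96, dtype=np.float32).reshape(72, 96)

# Weaken the signal, then add uniform random noise in the range 0 to 1.
signal = greyscale / 4
noise = rng.random(greyscale.shape).astype(np.float32)
noisy = signal + noise
```

Dividing by 8, 16 or 32 instead of 4 lowers the signal further relative to the same noise level.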


We can modify the code to use a for loop and look at the image when the signal is 1/4, 1/8, 1/16 and 1/32 of the original value, comparing each with the original.


As we artificially decrease the intensity of the image with respect to the noise, we can see that the image becomes harder and harder to view due to the poor signal to noise ratio:

Adding a Bright Pixel

Let's now take one of our noisy images, figure 28, find the maximum pixel value and then introduce a pixel of maximum brightness. This emulates, for instance, an over-sensitive pixel in a camera.


As you can see this new image is now very hard to see…

Applying Filters to Improve Image Quality

We have this very poor quality image. Can we make it any better?

Let's try an upper threshold:
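An upper threshold can be sketched as below. The noisy stand-in image, the position and value of the artificial bright pixel, and the threshold of 1 are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in noisy image (assumption) with one artificial over-bright pixel.
noisy = rng.random((72, 96)).astype(np.float32)
noisy[10, 20] = 50.0

# Upper threshold: any value above the limit is set to the limit.
upper = 1.0
thresholded = noisy.copy()
thresholded[thresholded > upper] = upper
```

A lower threshold works the same way with the comparison reversed, setting every value below the lower limit to that limit.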


This has removed the highest artificial noisy pixel and has restored some image quality.

Let's try using a lower value for the upper threshold:


This has removed the lowest artificial noisy pixel and has restored some image quality.

We can try adding in a lower threshold also.


Now we can attempt to multiply the data by a constant factor and once again threshold out any value higher than 1.


Image processing by carrying out only thresholding is quite limited. We can instead use other functions to work on the data, such as a median filter:
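To show what a median filter does, here is a NumPy-only sketch of a 3 by 3 median filter. The function median_filter_3x3 is written for this example and is not the library call used in the original code; the noisy stand-in image and its bright pixel are also assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def median_filter_3x3(image):
    """Replace each pixel with the median of its 3x3 neighbourhood."""
    padded = np.pad(image, 1, mode="edge")           # repeat the edge pixels
    windows = sliding_window_view(padded, (3, 3))    # rows x cols x 3 x 3
    return np.median(windows, axis=(-2, -1))


rng = np.random.default_rng(0)

# Stand-in noisy image (assumption) with one artificial over-bright pixel.
noisy = rng.random((72, 96)).astype(np.float32)
noisy[10, 20] = 50.0

filtered = median_filter_3x3(noisy)
```

Because a single extreme value can never be the median of nine values, the bright pixel disappears without the need to choose a threshold.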


Or a Gaussian filter of width 3, which replaces each pixel with a Gaussian-weighted average computed over 3 by 3 data points and updates the image accordingly.



Let's return to the original image and look at inverting it. We know that the maximum value of each pixel in each channel is 1. We can invert the data by taking the current data away from 1.
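A sketch of the inversion is shown below, with img1 a small random stand-in for the picture (an assumption). Only the three colour pages are inverted here; this is a choice made for the example, since subtracting the alpha page from 1 as well would turn a fully opaque picture fully transparent.

```python
import numpy as np

# Stand-in picture (assumption): small random RGBA array, fully opaque.
rng = np.random.default_rng(0)
img1 = rng.random((4, 6, 4)).astype(np.float32)
img1[:, :, 3] = 1.0

# Take the colour pages away from 1; leave the alpha page untouched.
inverted = img1.copy()
inverted[:, :, :3] = 1 - img1[:, :, :3]
```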