This page is obsolete; I wrote it while I was just starting to learn Python. I have since written a much better set of notes on pandas, available here:
In this guide we will look at creating a bar graph of UK rainfall in 2016, 2017, 2018 and 2019. We will then look at creating a stacked bar graph and a grouped bar graph.
Data to be plotted can be created as individual vectors, a matrix or a dataframe. This guide primarily uses individual vectors, where each data column corresponds to a separate vector. For plotting, however, it is also worth revising how to index a column within a matrix and how to select a column from a dataframe.
The data below is the rainfall per month. Since 2019 isn't complete yet, the values for the second half of that year are mainly 0.
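The original listing is not reproduced here, so the following is a minimal sketch of how the four vectors might be laid out as NumPy arrays. The numbers are made-up placeholders, not the real UK rainfall measurements; only the variable names of the form mm_Rain_2016 come from this guide.

```python
import numpy as np

# Placeholder values only, NOT the real UK rainfall figures.
# One entry per month, January to December, in mm.
mm_Rain_2016 = np.array([130.0, 95.0, 90.0, 85.0, 70.0, 110.0,
                         80.0, 90.0, 75.0, 55.0, 105.0, 80.0])
mm_Rain_2017 = np.array([75.0, 95.0, 85.0, 30.0, 70.0, 115.0,
                         100.0, 95.0, 110.0, 95.0, 100.0, 120.0])
mm_Rain_2018 = np.array([135.0, 70.0, 110.0, 95.0, 55.0, 35.0,
                         55.0, 95.0, 80.0, 90.0, 105.0, 120.0])
# The second half of 2019 is filled with zeros because the year is incomplete.
mm_Rain_2019 = np.array([70.0, 80.0, 130.0, 50.0, 65.0, 115.0,
                         0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
```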
This can be combined into a single multiple-column variable.
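One way to build such a multiple-column variable is np.column_stack, which places each vector in its own column. This is a sketch under assumptions: the original may have combined the data differently, and the values are made-up placeholders rather than real measurements.

```python
import numpy as np

# Placeholder values only, NOT the real UK rainfall figures.
mm_Rain_2016 = np.array([130.0, 95.0, 90.0, 85.0, 70.0, 110.0,
                         80.0, 90.0, 75.0, 55.0, 105.0, 80.0])
mm_Rain_2017 = np.array([75.0, 95.0, 85.0, 30.0, 70.0, 115.0,
                         100.0, 95.0, 110.0, 95.0, 100.0, 120.0])
mm_Rain_2018 = np.array([135.0, 70.0, 110.0, 95.0, 55.0, 35.0,
                         55.0, 95.0, 80.0, 90.0, 105.0, 120.0])
mm_Rain_2019 = np.array([70.0, 80.0, 130.0, 50.0, 65.0, 115.0,
                         0.0, 0.0, 0.0, 0.0, 0.0, 0.0])

# Each vector becomes one column: 12 rows (months) by 4 columns (years).
rainfalldata = np.column_stack((mm_Rain_2016, mm_Rain_2017,
                                mm_Rain_2018, mm_Rain_2019))
print(rainfalldata.shape)
```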
This shows up in the variable explorer like:
When it is maximised it looks like:
However note that the column and row indexes have little meaning.
It is more insightful to put it together as a dataframe.
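A dataframe can be sketched with pandas by passing a dictionary that maps column names to the vectors. The name rain_df and the column names '2016' to '2019' are assumptions made for illustration, and the values are made-up placeholders.

```python
import numpy as np
import pandas as pd

# Placeholder values only, NOT the real UK rainfall figures.
mm_Rain_2016 = np.array([130.0, 95.0, 90.0, 85.0, 70.0, 110.0,
                         80.0, 90.0, 75.0, 55.0, 105.0, 80.0])
mm_Rain_2017 = np.array([75.0, 95.0, 85.0, 30.0, 70.0, 115.0,
                         100.0, 95.0, 110.0, 95.0, 100.0, 120.0])
mm_Rain_2018 = np.array([135.0, 70.0, 110.0, 95.0, 55.0, 35.0,
                         55.0, 95.0, 80.0, 90.0, 105.0, 120.0])
mm_Rain_2019 = np.array([70.0, 80.0, 130.0, 50.0, 65.0, 115.0,
                         0.0, 0.0, 0.0, 0.0, 0.0, 0.0])

# Hypothetical dataframe name and column names; each key becomes a named column.
rain_df = pd.DataFrame({'2016': mm_Rain_2016,
                        '2017': mm_Rain_2017,
                        '2018': mm_Rain_2018,
                        '2019': mm_Rain_2019})
print(rain_df.head())
```

Unlike the plain matrix, each column now carries a meaningful name that can be used to select it.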
This shows up in variable explorer as:
When maximised it looks like:
Creating a Numeric x-axis
We will create a numerical x-axis and use this to plot the mm_Rain_2016 per month. To do this we will use the arange function within the NumPy library:
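A minimal sketch of this step follows; the name numeric_months is the one used later in this guide, and the start, stop and step arguments are spelled out for clarity.

```python
import numpy as np

# arange(start, stop, step) excludes the stop value, so this gives 0 to 11.
numeric_months = np.arange(0, 12, 1)
print(numeric_months)
```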
[ 0 1 2 3 4 5 6 7 8 9 10 11]
Numeric Months can also be added as an additional column in the matrix or dataframe.
To add it to the front of the matrix, we must convert it to a column vector.
Then concatenate it:
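These two steps might look like the sketch below, which uses reshape to make the column vector and np.concatenate along axis 1 to join it to the matrix. The zero-filled rainfalldata here is only a stand-in for the real 12 by 4 matrix, and the name numeric_months_col is hypothetical.

```python
import numpy as np

# Stand-in for the 12 x 4 rainfall matrix (placeholder zeros, not real data).
rainfalldata = np.zeros((12, 4))

numeric_months = np.arange(0, 12, 1)

# reshape(-1, 1) turns the 1-D vector of length 12 into a 12 x 1 column vector.
numeric_months_col = numeric_months.reshape(-1, 1)

# axis=1 joins side by side, putting the months in the 0th column.
rainfalldata = np.concatenate((numeric_months_col, rainfalldata), axis=1)
print(rainfalldata.shape)
```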
Plotting a Bar Graph
To plot a bar graph, we can use the bar function from the Matplotlib library. As inputs we pass numeric_months for the x values and mm_Rain_2016 for the y values, and we add label='2016' to distinguish this data set from the rest (this label will be discussed later).
In some other Python IDEs we may also need to call plt.show() in order to show the graph.
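A minimal sketch of the call described above, assuming Matplotlib's pyplot interface is imported as plt. The rainfall values are made-up placeholders.

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder values only, NOT the real UK rainfall figures.
mm_Rain_2016 = np.array([130.0, 95.0, 90.0, 85.0, 70.0, 110.0,
                         80.0, 90.0, 75.0, 55.0, 105.0, 80.0])
numeric_months = np.arange(0, 12, 1)

plt.figure()
# x values first, then bar heights; label is stored for use by a legend.
bars = plt.bar(numeric_months, mm_Rain_2016, label='2016')
# plt.show()  # uncomment if the IDE does not display figures automatically
```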
If we use the matrix rainfalldata, we would instead index all rows and the 0th column for the x data and all rows and the 1st column for the y data respectively.
If we use the dataframe we would index using the column names:
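Both forms of indexing can be sketched as follows. The matrix layout (months in column 0, 2016 data in column 1) follows the description above; the dataframe name rain_df and the column names 'Month' and '2016' are assumptions, and the values are made-up placeholders.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Placeholder values only, NOT the real UK rainfall figures.
mm_Rain_2016 = np.array([130.0, 95.0, 90.0, 85.0, 70.0, 110.0,
                         80.0, 90.0, 75.0, 55.0, 105.0, 80.0])
numeric_months = np.arange(0, 12, 1)

# Matrix: all rows of column 0 for x, all rows of column 1 for y.
rainfalldata = np.column_stack((numeric_months, mm_Rain_2016))
plt.figure()
bars_matrix = plt.bar(rainfalldata[:, 0], rainfalldata[:, 1], label='2016')

# Dataframe: select columns by name (hypothetical column names).
rain_df = pd.DataFrame({'Month': numeric_months, '2016': mm_Rain_2016})
plt.figure()
bars_df = plt.bar(rain_df['Month'], rain_df['2016'], label='2016')
```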
To change the x ticks from numeric values (0 to 11 in steps of 1, using zero-based indexing) to the names of the months, we can use the xticks function from the Matplotlib library:
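A sketch of the tick replacement follows. The list name month_names and the three-letter abbreviations are assumptions, and the rainfall values are made-up placeholders.

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder values only, NOT the real UK rainfall figures.
mm_Rain_2016 = np.array([130.0, 95.0, 90.0, 85.0, 70.0, 110.0,
                         80.0, 90.0, 75.0, 55.0, 105.0, 80.0])
numeric_months = np.arange(0, 12, 1)

# Hypothetical label list: one string per tick position.
month_names = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
               'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']

plt.figure()
plt.bar(numeric_months, mm_Rain_2016, label='2016')
# First argument: tick positions; second argument: the text shown at each one.
plt.xticks(numeric_months, month_names)
```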
We can then set the x-axis label, y-axis label and title using the xlabel, ylabel and title functions from the Matplotlib library, with the string input argument being the text we want on our bar graph. We can also display the label of the data set using the legend function from the Matplotlib library, with the loc input argument set to the string 'upper right'. I discussed these in more detail when I mentioned Histogram (hist) plotting.
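A sketch of these four calls follows. The label and title strings are illustrative choices, and the rainfall values are made-up placeholders.

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder values only, NOT the real UK rainfall figures.
mm_Rain_2016 = np.array([130.0, 95.0, 90.0, 85.0, 70.0, 110.0,
                         80.0, 90.0, 75.0, 55.0, 105.0, 80.0])
numeric_months = np.arange(0, 12, 1)

plt.figure()
plt.bar(numeric_months, mm_Rain_2016, label='2016')
plt.xlabel('Month')               # illustrative text
plt.ylabel('Rainfall (mm)')       # illustrative text
plt.title('UK Rainfall')          # illustrative text
# legend uses the label passed to bar; loc places the box in the axes.
plt.legend(loc='upper right')
```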
Now I will add gridlines to this chart:
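Gridlines can be switched on with the grid function. This is a minimal sketch; the original may have passed extra styling arguments that are not shown here, and the rainfall values are made-up placeholders.

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder values only, NOT the real UK rainfall figures.
mm_Rain_2016 = np.array([130.0, 95.0, 90.0, 85.0, 70.0, 110.0,
                         80.0, 90.0, 75.0, 55.0, 105.0, 80.0])
numeric_months = np.arange(0, 12, 1)

plt.figure()
bars = plt.bar(numeric_months, mm_Rain_2016, label='2016')
# True turns on gridlines at the major ticks of both axes.
plt.grid(True)
```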
This dataset doesn't have an error, but if we assume a measurement error of 5 we can make an error array:
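One way to build a constant error vector is to scale an array of ones. This is an assumption about how the original created it; np.full(12, 5.0) would give the same result.

```python
import numpy as np

# Twelve entries, one per month, each equal to the assumed error of 5.
yerror = 5 * np.ones(12)
print(yerror)
```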
[5. 5. 5. 5. 5. 5. 5. 5. 5. 5. 5. 5.]
The plot can be made with the following modifications:
On line 4, the close function from the Matplotlib library is used with the input string 'all' to close all figures before running this code. This means new figures will be created, as opposed to modifying a figure that is already open.
The variable yerror is created on line 13. We can then modify line 16 to set the input argument yerr equal to our variable yerror.
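Putting these modifications together gives something like the sketch below. It is not the original listing, so its line numbers will not match the ones quoted above; the label strings and month_names are assumptions, and the rainfall values are made-up placeholders.

```python
import numpy as np
import matplotlib.pyplot as plt

# Close every open figure so a fresh one is created on each run.
plt.close('all')

# Placeholder values only, NOT the real UK rainfall figures.
mm_Rain_2016 = np.array([130.0, 95.0, 90.0, 85.0, 70.0, 110.0,
                         80.0, 90.0, 75.0, 55.0, 105.0, 80.0])
numeric_months = np.arange(0, 12, 1)
month_names = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
               'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
yerror = 5 * np.ones(12)  # assumed measurement error of 5 for every month

plt.figure()
# yerr draws a vertical error bar of +/- yerror on top of each bar.
bars = plt.bar(numeric_months, mm_Rain_2016, yerr=yerror, label='2016')
plt.xticks(numeric_months, month_names)
plt.xlabel('Month')
plt.ylabel('Rainfall (mm)')
plt.title('UK Rainfall')
plt.legend(loc='upper right')
plt.grid(True)
```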
Bar Colour, Bar Outline, Bar Outline Colour and Hatching
In this case we will change the bar colours to green with spot hatching and a black outline by adding the following input arguments: color, set to the desired [r, g, b, a] values; hatch, set to the string 'O'; edgecolor, set to the string 'k' (black); and linewidth, set to 1.2.
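A sketch of these styling arguments follows. The exact RGBA values are an assumption (a semi-transparent green is used here), and the rainfall values are made-up placeholders.

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder values only, NOT the real UK rainfall figures.
mm_Rain_2016 = np.array([130.0, 95.0, 90.0, 85.0, 70.0, 110.0,
                         80.0, 90.0, 75.0, 55.0, 105.0, 80.0])
numeric_months = np.arange(0, 12, 1)
yerror = 5 * np.ones(12)

plt.figure()
bars = plt.bar(numeric_months, mm_Rain_2016, yerr=yerror, label='2016',
               color=(0.0, 0.6, 0.0, 0.5),  # assumed RGBA: green, half transparent
               hatch='O',                    # large-circle hatching
               edgecolor='k',                # black outline
               linewidth=1.2)                # outline thickness in points
```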
The bar graph is very similar to the Histogram (hist) plot. I have already discussed these parameters in much more detail here:
Horizontal Bar Graph
A few minor changes can be made to the above to make a Horizontal Bar Graph. Lines 11 and 12 are updated from yerror to xerror. On line 16, bar is replaced by barh and xerr is used in place of yerr. The order of the input arguments in barh is y, x, as opposed to x, y in bar. Lines 17-19 are swapped between x and y: what was x is now y and what was y is now x.
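The horizontal version might look like the sketch below. It is not the original listing, so the quoted line numbers will not match; the label strings are assumptions and the rainfall values are made-up placeholders.

```python
import numpy as np
import matplotlib.pyplot as plt

plt.close('all')

# Placeholder values only, NOT the real UK rainfall figures.
mm_Rain_2016 = np.array([130.0, 95.0, 90.0, 85.0, 70.0, 110.0,
                         80.0, 90.0, 75.0, 55.0, 105.0, 80.0])
numeric_months = np.arange(0, 12, 1)
month_names = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
               'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
xerror = 5 * np.ones(12)  # the error now runs along the x direction

plt.figure()
# barh takes the positions (y) first and the bar lengths (x) second.
bars = plt.barh(numeric_months, mm_Rain_2016, xerr=xerror, label='2016')
plt.yticks(numeric_months, month_names)  # month names move to the y-axis
plt.ylabel('Month')
plt.xlabel('Rainfall (mm)')
plt.title('UK Rainfall')
plt.legend(loc='upper right')
```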
Plotting Multiple Bars
Returning to the code for the Vertical Bars, we can now look at adding more than one dataset. We can essentially copy line 16 and paste it as line 17, then modify it for the 2017 data set and give it a different colour and hatching:
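A sketch of the two bar calls follows. The colours and hatch patterns are assumptions (semi-transparent green and blue), and the rainfall values are made-up placeholders.

```python
import numpy as np
import matplotlib.pyplot as plt

plt.close('all')

# Placeholder values only, NOT the real UK rainfall figures.
mm_Rain_2016 = np.array([130.0, 95.0, 90.0, 85.0, 70.0, 110.0,
                         80.0, 90.0, 75.0, 55.0, 105.0, 80.0])
mm_Rain_2017 = np.array([75.0, 95.0, 85.0, 30.0, 70.0, 115.0,
                         100.0, 95.0, 110.0, 95.0, 100.0, 120.0])
numeric_months = np.arange(0, 12, 1)

plt.figure()
bars_2016 = plt.bar(numeric_months, mm_Rain_2016, label='2016',
                    color=(0.0, 0.6, 0.0, 0.5), hatch='O',
                    edgecolor='k', linewidth=1.2)
# Same x positions, so these bars are drawn over the 2016 bars.
bars_2017 = plt.bar(numeric_months, mm_Rain_2017, label='2017',
                    color=(0.0, 0.0, 0.8, 0.5), hatch='/',
                    edgecolor='k', linewidth=1.2)
plt.legend(loc='upper right')
```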
Note however that it has just plotted the blue 2017 bars on top of the 2016 green bars. This can be seen because I specified transparency when plotting.
Stacked Bar Graphs
We may instead want to stack the 2017 data on top of the 2016 data. This can be done by modifying line 17 to add the input argument bottom, set to the mm_Rain_2016 data:
We will now copy and paste line 17 twice and amend it for the 2018 and 2019 data, again selecting a custom colour and hatching pattern. For the mm_Rain_2018 data, it has to have at its bottom both the mm_Rain_2016 and the mm_Rain_2017 data so the function np.add is used to sum these two arrays together. Likewise for the mm_Rain_2019 data, it has to have at its bottom, the mm_Rain_2016, mm_Rain_2017 and mm_Rain_2018 data.
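The stacking described above can be sketched as follows, with np.add building the running totals used for bottom. The names bottom_2018 and bottom_2019 are hypothetical, the colours are illustrative, and the rainfall values are made-up placeholders.

```python
import numpy as np
import matplotlib.pyplot as plt

plt.close('all')

# Placeholder values only, NOT the real UK rainfall figures.
mm_Rain_2016 = np.array([130.0, 95.0, 90.0, 85.0, 70.0, 110.0,
                         80.0, 90.0, 75.0, 55.0, 105.0, 80.0])
mm_Rain_2017 = np.array([75.0, 95.0, 85.0, 30.0, 70.0, 115.0,
                         100.0, 95.0, 110.0, 95.0, 100.0, 120.0])
mm_Rain_2018 = np.array([135.0, 70.0, 110.0, 95.0, 55.0, 35.0,
                         55.0, 95.0, 80.0, 90.0, 105.0, 120.0])
mm_Rain_2019 = np.array([70.0, 80.0, 130.0, 50.0, 65.0, 115.0,
                         0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
numeric_months = np.arange(0, 12, 1)

# Each layer starts where the sum of the layers beneath it ends.
bottom_2018 = np.add(mm_Rain_2016, mm_Rain_2017)
bottom_2019 = np.add(bottom_2018, mm_Rain_2018)

plt.figure()
bars_2016 = plt.bar(numeric_months, mm_Rain_2016, label='2016', color='g')
bars_2017 = plt.bar(numeric_months, mm_Rain_2017, label='2017', color='b',
                    bottom=mm_Rain_2016)
bars_2018 = plt.bar(numeric_months, mm_Rain_2018, label='2018', color='r',
                    bottom=bottom_2018)
bars_2019 = plt.bar(numeric_months, mm_Rain_2019, label='2019', color='y',
                    bottom=bottom_2019)
plt.legend(loc='upper right')
```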
This gives a good example of a stacked bar graph.
Grouped Bar Graph
To do this, we are going to copy and paste the code above but remove the input argument bottom from lines 17, 18 and 19. We are also going to comment out the line that changes the xticks to text (line 20) and the line that shows the legend (line 24). This is done by highlighting the line and using the IDE's comment shortcut, or by typing a # in front of it.
This plots the bars all on top of one another, as we saw before. Note that these bars are all 1 unit in width. We want 4 bars in 1 unit, so we can specify a new variable width on line 16 and set it to 1/4. In line 17, we must specify the width as the third input argument after the x and y data. In line 18, we want to move the x data by the width of 1 bar, so we add our scalar variable width to the x data and again specify the width as the third input. For line 19, we need to shift the data by the width of 2 bars, so we add 2 times our width and again specify the width as the third input. For line 20, we shift by the width of 3 bars and once again specify the width.
This gives us our grouped bar graph. Now that we have it, we can simply uncomment lines 21 and 25:
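The grouped layout can be sketched as below, with each year shifted along the x-axis by a multiple of the bar width. It is not the original listing, so the quoted line numbers will not match; the colours and month_names are assumptions, and the rainfall values are made-up placeholders.

```python
import numpy as np
import matplotlib.pyplot as plt

plt.close('all')

# Placeholder values only, NOT the real UK rainfall figures.
mm_Rain_2016 = np.array([130.0, 95.0, 90.0, 85.0, 70.0, 110.0,
                         80.0, 90.0, 75.0, 55.0, 105.0, 80.0])
mm_Rain_2017 = np.array([75.0, 95.0, 85.0, 30.0, 70.0, 115.0,
                         100.0, 95.0, 110.0, 95.0, 100.0, 120.0])
mm_Rain_2018 = np.array([135.0, 70.0, 110.0, 95.0, 55.0, 35.0,
                         55.0, 95.0, 80.0, 90.0, 105.0, 120.0])
mm_Rain_2019 = np.array([70.0, 80.0, 130.0, 50.0, 65.0, 115.0,
                         0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
numeric_months = np.arange(0, 12, 1)
month_names = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
               'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']

width = 1 / 4  # four bars share one unit on the x-axis

plt.figure()
# The third positional argument of bar is the bar width.
bars_2016 = plt.bar(numeric_months, mm_Rain_2016, width,
                    label='2016', color='g')
bars_2017 = plt.bar(numeric_months + width, mm_Rain_2017, width,
                    label='2017', color='b')
bars_2018 = plt.bar(numeric_months + 2 * width, mm_Rain_2018, width,
                    label='2018', color='r')
bars_2019 = plt.bar(numeric_months + 3 * width, mm_Rain_2019, width,
                    label='2019', color='y')
plt.xticks(numeric_months, month_names)
plt.legend(loc='upper right')
```

With these tick positions each month name sits under the first bar of its group; shifting the tick positions by 1.5 times the width is one possible way to centre the names under each group.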