Getting Started with Python and the Spyder 5 IDE (Procedural Programming)



The Scientific Python Development Environment (Spyder) has a very similar user interface to the commercial product Matrix Laboratory (Matlab), which is commonly used to teach computer science fundamentals at universities in the scientific and engineering fields. Unlike Matlab, which is an expensive commercial product, Python and the Spyder Integrated Development Environment (IDE) are open-source and can be freely downloaded using the Anaconda Python Distribution. The Anaconda Python Distribution contains a number of additional open-source data science libraries such as Numeric Python (numpy), the Python Data Analysis Library (pandas) and the Python plotting library (matplotlib), which can be used to replace Matlab functionality for most purposes in the scientific fields.

This particular guide is a beginner guide that will look at the inbuilt Python programming language and focus on the concept of basic procedural programming. In procedural programming, code is executed line by line in the order specified, for example within a script file. This guide will use Spyder 5, which is one of the best Python IDEs, particularly for beginners, due to its simple but powerful user interface and versatile variable explorer. Understanding the inbuilt core Python programming language and procedural programming is a prerequisite to using any data science libraries.

If you have not installed Spyder 5 with the Anaconda Python distribution already, please follow my installation guides:

builtins module

Python has a number of builtin objects which are as the name suggests inbuilt into Python. However it can be quite insightful to import the builtins module so we can type in builtins followed by a dot . to view the list of inbuilt objects using Kite autocompletion.

import builtins

Kite shows the most commonly used objects in the module, followed by everything else alphabetically. Of particular note are:

Inbuilt classes:

  • str
  • list
  • int
  • dict
  • bool
  • float
  • tuple
  • set
  • complex
  • range
  • type

Inbuilt functions:

  • print
  • str
  • input
  • enumerate
  • max
  • min
  • len

We can output this list to the console by using the dir function (an abbreviation for directory) to inspect the object builtins. To use the function we need to use parentheses () to enclose the positional input arguments. Details about these will be shown in the script editor when the function dir is typed with an open parenthesis:

We see that the function dir is wanting a positional input argument object. In this case we can specify the object builtins:

import builtins
print(dir(builtins))


The top inbuilt class is the str class, which is an abbreviation for string of characters. We can use the class str to instantiate (create a new instance of) the str class. The () are used to enclose the input argument object, which has a default value of two quotations (this represents an empty str).

object names and assignment operator

The initialization method has an output and we can assign this output to an object name. The object name is always on the left hand side of the assignment operator =. By convention object names are written in snake_case, i.e. all lower case characters and no spaces, with the underscore used to split multiple words in the object name. Numbers can be included in the object name but the object name cannot begin with a number.

To prevent confusion between an object name and a str, the text of the str is enclosed in double quotations "".

The text of the str can be in upper and lower case and include almost any character. The backslash \ , double quotation " and single quotation ' need special treatment using escape characters, which will be discussed later.

snake_case = str("hello World!")

script files and variable explorer

A script file is essentially a text file with a .py extension instead of a .txt extension. It contains Python code. To the top left (underlined in red) the file path of the currently selected script displays and the name of the script is shown on the scripts tab.

The icons also on the left (circled in blue) give the options to create a new script, open an existing script, save the script and save the script as. Then there is the run script icon (circled in orange) which we will select.

The prompt [1] in the console (bottom right) informs us that the selected script file has been run, giving us the location of the script file that was run.

The script (left hand side) contains only a single line of Python code. The str, "hello World!" is assigned to the object snake_case which displays on the variable explorer (right hand side circled in green). Note the variable explorer displays the object name, the datatype of the object, the length of the object (this is a string of 12 characters) and the characters themselves. The quotations are not shown on the variable explorer.

the print function

We can use the print function to print the str to the console. If we type in print with open parenthesis we will view the docstring of the function. We see that the print function has variable positional input arguments (underlined in red) and a number of keyword input arguments (the keyword arguments all have a default value assigned which will be used if we do not specify otherwise).

We can set value to be the object name of our str.

snake_case = str("hello World!")
print(snake_case)

The print function prints output to the console and has no return statement. Let's see what this means by attempting to assign it to an output which we will call print_output. When we do this, we see that the datatype of print_output is a NoneType object.
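A minimal sketch of this experiment, reusing the names snake_case and print_output from the text. The comparison with None is added here only for illustration:

```python
snake_case = str("hello World!")

# print displays the str in the console but returns nothing,
# so the name on the left is bound to None
print_output = print(snake_case)

print(type(print_output))
print(print_output is None)
```

In the variable explorer print_output shows up with the NoneType datatype, which is the type of the single value None.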

str methods

If we type in str followed by a dot, Kite autocompletion will show a number of attributes that can be called from a str. Most of these are methods that carry out some manipulation of the string of characters.

The methods lower, upper and capitalize will return the str in lowercase, in uppercase, or with only the first character capitalized (and the rest in lowercase) respectively. To use a method from a class, we need to specify the positional input argument self. self is the instance of the class we want to operate on. In our case snake_case is an instance of the str class.

We can assign the output of the method to a new object_name:

snake_case = str("hello World!")
lower_case = str.lower(snake_case)
upper_case = str.upper(snake_case)
capitalized = str.capitalize(snake_case)

Instead of calling the method from the class and specifying the instance, we can call the method as a function from the instance itself. In such a scenario we don't need to specify self as it is implied.

Also we don't need to explicitly use the str class to instantiate a str but can instead directly create one using quotations directly.

greeting = "goodbye World!"

Notice the list of attributes that can be accessed from the str greeting is identical to that from snake_case because both of these objects are instances of the str class and the str class contains a blueprint for these attributes. We can use upper as a function from the instance greeting. Under the hood this function is defined as a method in the class str. The following lines are therefore equivalent:

upper_case = str.upper(greeting)
upper_case = greeting.upper()

In the former, the instance self must be supplied as greeting and in the latter the instance greeting is implied. The number of input arguments required for a function are shown when the function is called with open parenthesis.

In the case of the replace function we require a substring old to replace, with new being the replacement substring. If this method was called directly from the str class, we would require the instance self and then the additional positional input arguments old and new. The comma , is used as a delimiter to separate the input arguments.

Notice that no change takes place because the substring "goodbye" isn't present in "GOODBYE WORLD!" i.e. the str is case sensitive.
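The screenshot for this step is not reproduced here, so the following is only a sketch of what it likely showed. The replacement word "hello" and the object name unchanged are assumptions made for illustration:

```python
greeting = "goodbye World!"
upper_case = greeting.upper()

# The lowercase substring "goodbye" does not occur in "GOODBYE WORLD!",
# so replace returns a str with identical characters
unchanged = upper_case.replace("goodbye", "hello")  # "hello" is an assumed value

print(upper_case)
print(unchanged)
print(unchanged == upper_case)
```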

We can call a function within a function and split the input arguments for a function over multiple lines to prevent it from wrapping:

We can now see that this works as expected:

greeting = "goodbye World!"
upper_case = greeting.upper()
greeting_2 = upper_case.replace("goodbye".upper(),
                                "hello")  # "hello" is an assumed replacement word; the original line is truncated

If we wanted to make this all upper, we could also specify this:

greeting = "goodbye World!"
upper_case = greeting.upper()
greeting_2 = upper_case.replace("goodbye".upper(),
                                "hello".upper())  # "hello" is an assumed replacement word; the original line is truncated

This works as expected:

str datamodel methods

There are a number of datamodel methods, sometimes called special methods, each of which maps to an operator that carries out an operation between two instances (usually of the same class). For a str one of these operations is concatenation, that is the joining of two strings together. We can call the datamodel method explicitly using __add__. Because the datamodel methods begin and end with a double underscore, they are also colloquially called dunder methods.

We can get a sentence from these two words by using:

sentence = word_0.__add__(word_1)

This is much more commonly done by using the + operator.

sentence = word_0 + word_1

Notice the str sentence has no spaces. These have to be manually specified.

sentence = word_0 + " " + word_1
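Putting these steps together as a short sketch. The values "hello" and "world" for word_0 and word_1 are the ones used later in this guide; the names sentence_dunder, sentence_plus and sentence_spaced are made up here so the three results can be compared side by side:

```python
word_0 = "hello"
word_1 = "world"

sentence_dunder = word_0.__add__(word_1)   # explicit datamodel method
sentence_plus = word_0 + word_1            # the + operator maps to __add__
sentence_spaced = word_0 + " " + word_1    # the space has to be added manually

print(sentence_dunder)
print(sentence_plus)
print(sentence_spaced)
```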

escape characters

Notice what happens when a str containing a double quotation is typed: we get a syntax error. The syntax color coding thinks we have ended the str, and her quotation is taken to be 3 different object names which are not defined.

To rectify this we can instead enclose the str of characters with single quotations.

In general double quotations are more commonly used to enclose a string of characters as the apostrophe is more common than a quotation.

However in some cases we will need to use both.

To do this we need to use the special symbol \ to insert an escape character.

This now shows up as expected on the variable explorer.
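The screenshots for these steps are not reproduced here, so as a sketch the three cases might look like the following. The example sentences and object names are invented for illustration:

```python
# Double quotations inside a str enclosed in single quotations
quote = 'She said "hello"'

# An apostrophe inside a str enclosed in double quotations
apostrophe = "It's sunny today"

# Both in one str: the enclosing quotation type is escaped with \
both = "She said \"it's sunny today\""

print(quote)
print(apostrophe)
print(both)
```

The \ itself is not stored in the str; it only tells Python that the quotation which follows is part of the text.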

The \ can be used to insert some other escape characters: \t is used to represent a tab and \n is used to represent a new line. For example:

greeting = "\thello \nWorld!"

If we return to the print function we can see the keyword input arguments sep and end. By default sep separates the printed values with a space and end ends the printed output with a new line.

We can see how this works by printing the words "hello" and "world" three times and changing the sep and end keyword arguments in the middle print statement:

print("hello", "world")
print("hello", "world", sep="\t", end= "")
print("hello", "world")

Now sometimes we will want a str of the file path, for example the file path of the script file: C:\Users\Phili\

The file path includes the character \ which Python recognises as an instruction to insert an escape character. To get a \ we must use a \\ an instruction to insert an escape character that is itself \.

file_path = "C:\\Users\\Phili\\"

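raw strings

We can create a raw string, or r-string, by prepending a string with r. In a raw string the \ is treated as a literal character rather than as the start of an escape sequence, allowing one to rapidly copy a file path from Windows Explorer to Python. Note that a raw string cannot end in a single \ because it would escape the closing quotation, so the trailing \ is left off here.

file_path = r"C:\Users\Phili"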

formatted strings

We have seen that it can be quite cumbersome to concatenate strings together to make a sentence. There is an easier way to do this called a formatted str. The formatted string is prepended with f and { } can be inserted within the str in order to place an object within the str.

name = "Philip"
weather = "sunny"
print(f"Hello {name}, it is {weather} today.")

Note that Spyder 5.0.3 currently doesn't apply the correct syntax coloring for a formatted str, so I have adjusted the screenshot.

the input function

It is quite common to use the input function to gather a str of characters from a user. The user will be given a prompt in the console to input characters and the output will be stored as a variable.

We can ask the user for their name and store it as the object name user_name.

user_name = input(prompt = "What is your name?")
print(f"Hello {user_name}")

Note that in the following form, there is no space after the question mark in the prompt so it looks a bit funny in the console.

However the code works as expected and the greeting displays:

We can update this to the form including a colon and a space at the end.

user_name = input(prompt = "Input your name: ")
print(f"Hello {user_name}")

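Because prompt is the only input argument, we can also supply it as a positional input argument and drop the keyword. Note that in the standard Python interpreter the built-in input function accepts prompt only as a positional argument, so the keyword form above raises a TypeError outside of an IPython-based console such as the one in Spyder. The positional form works everywhere.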

user_name = input("Input your name: ")
print(f"Hello {user_name}")

When using positional input arguments, all input arguments must be supplied in the same order as in the docstring. If the function has a mixture of positional input arguments and keyword input arguments and you specify all the input arguments in the order listed in the docstring, you can drop the keywords, i.e. in this case prompt =.

If however the function has multiple keyword input arguments as seen with the print function for example and you only want to use a subset of the keyword input arguments (i.e. want the unspecified keyword input arguments to use their default value), you must specify the names of the keyword input arguments you wish to use to avoid confusion.
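As a sketch of this rule using the print function, whose keyword input arguments sep and end were seen earlier. io.StringIO and the file keyword argument are used here only to capture the printed text so it can be inspected, and the object names are made up for illustration:

```python
import io

buffer = io.StringIO()

# Only sep is named; end keeps its default value of a new line
print("hello", "world", sep="-", file=buffer)

# Without the keyword, "-" is treated as just another value to print
print("hello", "world", "-", file=buffer)

captured = buffer.getvalue()
print(captured)
```

The first call joins the two words with a dash, while the second prints the dash as a third value separated by the default space.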


We can use the str method split to split a string of characters by use of a separator. sep is a keyword input argument which has a default value of None, meaning it will by default split the string of characters wherever there is whitespace, such as a space.

We can use it to split our sentence "hello world" into the individual words "hello" and "world".

word_0 = "hello"
word_1 = "world"
sentence = word_0 + " " + word_1
words = sentence.split()

Notice that the output of this method is however a new object type known as a list. The list displays in the variable explorer and can be double clicked so it can be examined in more detail. A list can be thought of as a collection of objects, and in this case each object is a string of characters.

A list can also be created explicitly using the list class which takes in an iterable positional input argument. For a list square brackets are used to enclose the list [ ] and a comma is used as a delimiter to split each object (in this case each object is a string of characters).

For example we can make a list of the string of characters "goodbye" and "world":

words = list(["goodbye", "world"])

We don't need to explicitly use the list class to instantiate a list but can instead directly create one using square brackets and comma delimiters directly.

words = ["hello", "and", "goodbye", "world"]

In some cases where each str in the list is long, it may be more convenient to split the list over multiple lines by use of the delimiter.

words = ["hello", "and",
         "goodbye", "world"]

list methods

A number of attributes can be called from the list instance words. The list is said to be mutable and most of the list methods will mutate the list instance self directly. For example the method extend can be used to extend a list (for example with another list) and the method append can be used to append an additional item to the end of the list.

Let's create two lists to demonstrate these methods.

# %% Create lists
list_0 = ["hello", "world"]
list_1 = ["be", "happy"]
# %% Extend list_0 with list_1
new_list = list_0.extend(list_1)

In Spyder beside the Run Script button, there is a button for running a cell. Hovering over this button gives instructions to create a new cell using the syntax # %%. In Python the # is recognised as the start of a comment, which is not executed as code. The text displays as grey indicating that it is a comment. A separator will appear at the top and bottom of each cell and the currently selected cell will be highlighted in yellow. Beside the run current cell button is the run current cell and move onto the next cell button. Beside that button is the run highlighted selection button.

Selecting run cell and move onto the next cell results in the objects list_0 and list_1 being created and displaying on the variable explorer.

Notice that the second cell is now highlighted as we used the run cell and move onto the next cell button. When we run the 2nd cell, we can see that new_list is a NoneType datatype. This is because the extend method has no return statement. Recall that we saw similar behaviour when we attempted to assign the output of the print function to an object name. Instead of a return value, we can see that list_0 is now directly extended and now includes the two additional strings of characters that were in list_1. This method is said to directly mutate list_0.

Now let's use the method append as opposed to extend. Like extend, this method has no output and will directly mutate list_0 so there is no point in assigning the output to an object name.

# %% Create lists
list_0 = ["hello", "world"]
list_1 = ["be", "happy"]
# %% Append list_0 with list_1
list_0.append(list_1)

Let's restart the kernel and rerun the 1st cell to restore the original lists.

Now let's run the 2nd cell to append list_1 to list_0. We see that the mutated list_0 now includes list_1 nested as an additional last element. We can open up list_0 in the variable explorer to view its contents level by level.


Let's add an additional cell and use the reverse method on list_1. This method will reverse the order of the values in list_1.

# %% Reverse list_1
list_1.reverse()

We can see that list_1 is reversed as expected. However we can also see, perhaps unexpectedly when first encountering this datatype that list_0 has also been mutated with the last index reflecting the changes made to list_1.

Sometimes this mutability behaviour is useful as an update to list_1 automatically will update list_0. Other times this mutability behaviour is not desired and one needs to be cautious. We can add another cell and use the copy method to make a copy of a list which we can work on without changing the original.

# %% Copy list_0
list_0_copy = list_0.copy()

Then we can create another cell and use the clear method, for example, to clear all the items in list_0_copy without mutating list_0.

# %% Clear list_0_copy
list_0_copy.clear()

Care should be taken with mutability. If we update the code to clear list_1 instead of list_0_copy in the last line, restart the kernel and then rerun the entire script:

# %% Create lists
list_0 = ["hello", "world"]
list_1 = ["be", "happy"]
# %% Append list_1 to list_0
list_0.append(list_1)
# %% Reverse list_1
list_1.reverse()
# %% Copy list_0
list_0_copy = list_0.copy()
# %% Clear list_1
list_1.clear()

We see that although list_0_copy is independent of list_0 both these lists are still dependent on list_1 and therefore a mutation in list_1 (clearing it in this case) mutates both list_0 and list_0_copy:
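A compact sketch of the same behaviour. It follows the append scenario described earlier; the is comparisons are an addition used to show that both outer lists hold the very same inner list object:

```python
list_0 = ["hello", "world"]
list_1 = ["be", "happy"]

list_0.append(list_1)         # list_0 now holds a reference to list_1
list_0_copy = list_0.copy()   # copies the outer list only (a shallow copy)

list_1.clear()                # mutate the shared inner list

print(list_0)
print(list_0_copy)
print(list_0[-1] is list_1)
print(list_0_copy[-1] is list_1)
```

The copy method duplicates the outer list but not the objects inside it, which is why clearing list_1 is visible from both list_0 and list_0_copy.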

Like the str, the list has the datamodel method __add__ defined and the + operator can be used to perform concatenation. Concatenation will return an output as opposed to mutating the instance self:

new_list = list.__add__(list_0, list_1)
new_list = list_0.__add__(list_1)
new_list = list_0 + list_1

We can concatenate in a manner that replicates the extend method and append methods respectively:

# %% Create lists
list_0 = ["hello", "world"]
list_1 = ["be", "happy"]
# %% Concatenate list_0 with list_1 (similar to extend)
new_list_extended = list_0 + list_1
# %% Concatenate list 0 with list 1 at last index (similar to append)
new_list_appended = list_0 + [list_1]
# %% Clear list_0
list_0.clear()

Restarting the kernel and running the script gives list_0, list_1, new_list_extended and new_list_appended, which all appear as expected. Note the final line, which clears list_0, does not mutate new_list_extended or new_list_appended. For more details you can optionally copy and paste the code into Spyder to run cell by cell.

nested lists

To create a list with a nested list within it directly we can use the following. Sometimes this is split over multiple lines for the sake of clarity.

greetings = [["hello", "goodbye"], ["world", "planet earth"]]
greetings2 = [["hello", "goodbye"], 
              ["world", "planet earth"]]

We can open one of the lists in the variable explorer (both lists are the same) for more clarity. We see that each element in the list is a list itself.


If we examine the list of lists in the variable explorer we can see the size is 2. 2 is the number of items in the outer list.

We can use the function len to return the len of an object:

In this case we can add a new cell to calculate the len of the list greetings (outer list dimension):

size_greetings_outer = len(greetings)

Running this cell creates the object size_greetings_outer which displays on the variable explorer. We can see that it is the datatype int which is an abbreviation for an integer and has a value 2.

indexing a collection using integers

Integers are whole numbers and are widely used in Python, particularly for counting objects, such as the number of elements in a list, and for selecting objects in a collection using the object's integer index:

If we open up, greetings for example within the variable explorer we can see that we have an integer index to the left hand side.

Note that the integer index starts with 0 and goes up in integer steps of 1. This is known as zero-based indexing. In zero-based indexing we are inclusive of the lower bound and exclusive of the upper bound. The upper bound of the list is the length of the list which we have already seen as 2. In the index we count from 0 to 2 (upper bound) in steps of 1, recalling that we are exclusive of 2 (upper bound) giving us index 0 and index 1.

We can use square brackets [] to index into a collection. A list is a collection of values and each value has an index.


Where idx is an int.

Let's select index 0 from greetings. For convenience we will type this command directly into the console without assigning it to an object name. Doing so will print the value to the console as opposed to showing a new object within the variable explorer:


We can use len once again to calculate the length of the inner list at index 0:


We can index once again into this list. Note the outer list index is selected and then the inner index is selected each using separate square brackets i.e.


Let's for example select index 1 of the inner list:


Note that this returns the string of characters "goodbye":
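The console commands from the missing screenshots can be sketched as follows, using the greetings list defined earlier. The names inner, inner_len and word are introduced here only so the results can be printed from a script:

```python
greetings = [["hello", "goodbye"], ["world", "planet earth"]]

inner = greetings[0]        # outer index 0 selects the first inner list
inner_len = len(inner)      # number of items in that inner list
word = greetings[0][1]      # outer index 0, then inner index 1

print(inner)
print(inner_len)
print(word)
```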

A str i.e. a string of characters is also a collection of characters and therefore has a length.


The len of the str "goodbye" is 7 i.e. there are 7 characters in the str. Each character in a str also has an index. We can have a look at the last character by selecting index 6. Recalling we count from 0 (inclusive of lower bound) to 7 (exclusive of upper bound) in steps of 1, meaning we reach 6. We can select the letter "e" using index 6 of this str with:


The number before 0 is -1. As a consequence we can also index the last object in a collection using the -1 index. We can also index each other object in a collection using its negative index.

To get the letter "e" again, we can use:
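As a sketch of both ways of reaching the last character. The names word, last_positive and last_negative are made up for illustration:

```python
word = "goodbye"

print(len(word))            # the valid indexes run from 0 to len(word) - 1

last_positive = word[6]     # counting from the start
last_negative = word[-1]    # counting back from the end

print(last_positive)
print(last_negative)
```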


We can index multiple objects in a collection by use of two colons, which is known as slicing. The format is of the form:


If only 1 colon is specified, step is assumed to be 1:


If the lower bound is not specified it is assumed to be 0:


If the upper bound is not specified, it is assumed to be the length of the list:


Be careful with the upper bound, recalling that we are exclusive of the upper bound.

If neither the lower nor the upper bound is specified, this makes a copy of the list:


We can step through the entire collection:


For example to get the word "bye" from the doubly nested str "goodbye", we can use:


And to get "planet earth" in reverse we can use:
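The slicing forms described in this section, collected into one sketch since the original screenshots are not reproduced here. The general pattern is collection[lower:upper:step]. The list letters and the result names are made up for illustration, while greetings is the nested list from earlier:

```python
letters = ["a", "b", "c", "d", "e"]
greetings = [["hello", "goodbye"], ["world", "planet earth"]]

every_other = letters[0:5:2]    # lower bound, upper bound and step
middle = letters[1:3]           # one colon: step is assumed to be 1
first_two = letters[:2]         # lower bound omitted: starts at 0
from_two = letters[2:]          # upper bound omitted: runs to the end
copied = letters[:]             # neither bound: a copy of the list
stepped = letters[::2]          # step through the entire collection

bye = greetings[0][1][4:]              # "goodbye" from index 4 onwards
reversed_text = greetings[1][1][::-1]  # a step of -1 walks the str backwards

print(every_other, middle, first_two, from_two, copied, stepped)
print(bye)
print(reversed_text)
```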


The list methods insert and pop are reliant on an index being supplied.

We can for example opt to use the method insert on the instance list_0 to insert at index 1, the object list_1. The word "world" which was at index 1 will now be at index 2.

# %% Create lists
list_0 = ["hello", "world"]
list_1 = ["be", "happy"]
# %% insert
list_0.insert(1, list_1)

Once again any changes made to list_1 will also show up in list_0, because list_0 holds a reference to list_1. For example:

# %% reverse list_1
list_1.reverse()

The method pop has a single input argument index which has a default value of -1 (the last index). The method pop both has a return statement and mutates the instance the method is called from.

In a new cell let's pop index 0 and assign it to the output popped_value:

# %% pop
popped_value = list_0.pop(0)

popped_value displays in the variable explorer and list_0 is mutated as expected:

The remove method does not use the index, instead it requires the user to specify a value.

Note that the value in the nested list is not accepted:

We have to select this list directly by indexing it and then using the remove method:

# %% remove

When we run this cell, notice that list_0 is mutated as expected. However, because list_0 and list_1 are linked, list_1 is also mutated:
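The code for the remove cell is missing from the extracted text, so the following is only a sketch of the idea. It starts from fresh lists rather than continuing the cells above, and the choice of "be" as the value to remove is an assumption:

```python
list_0 = ["hello", "world"]
list_1 = ["be", "happy"]
list_0.insert(1, list_1)

# "be" is not an item of the outer list, so remove raises a ValueError
try:
    list_0.remove("be")
except ValueError:
    print("'be' is not an item of the outer list")

# Index into the nested list first, then remove the value from it
list_0[1].remove("be")

print(list_0)
print(list_1)   # the same inner list object, so it changes as well
```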


Let's have a look at creating an instance of the int class. We can do this explicitly with the int class or we can just assign a variable to an integer value directly. Viewing the docstring, we can also see that the positional input x can also be a str of an integer number. This means we can change or cast the object type of a str of an integer to an int:

The keyword argument base is by default set to decimal, i.e. a base of 10. However it can be used to convert a str of a number in a different base, such as a binary str (base 2) or a hexadecimal str (base 16), both of which are widely used throughout computing, to a base 10 int.

num1 = int(2)
num2 = 2
num3 = int("2")
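The effect of the base keyword argument can be sketched as follows. The example values and object names are chosen here for illustration and do not come from the original screenshots:

```python
binary_num = int("1010", base=2)    # 1*8 + 0*4 + 1*2 + 0*1
hex_num = int("FF", base=16)        # 15*16 + 15
decimal_num = int("12")             # base defaults to 10

print(binary_num)
print(hex_num)
print(decimal_num)
```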

We have used the __add__ method for the str and list classes and discussed how it maps to the + operation. For an int, the + operator does not carry out concatenation but rather it carries out addition. Compare the results of:

2 + 2
"2" + "2"

The int class performs numeric addition and the str class performs concatenation. If we wanted to perform concatenation of two numeric numbers we would need to temporarily cast them as str, concatenate them and then recast them as an int. To do this we can use the str and int classes respectively.

When using the assignment operator, the right hand side is carried out first and the result is assigned to the object name. This allows one to use a class or function on an existing object and then to reassign the new value (right hand side) to the existing object name (left hand side).

# %% create two ints
num1 = 1
num2 = 2
# %% Cast to str
num1 = str(num1)
num2 = str(num2)
# %% Perform str concatenation
num3 = num1 + num2
# %% Cast to int
num3 = int(num3)

Running the 1st cell gives num1 and num2 as ints as seen on the variable explorer:

Running cell 2 casts them to str:

Running cell 3 performs the str concatenation:

Running cell 4 casts the num3 to an int:

This problem is encountered quite regularly when one uses the input function:

# %% get two ints from the user
num1 = input("First number: ")
num2 = input("Second number: ")
# %% add two ints
num3 = num1 + num2
# %% print my result
print(f"The addition of your two numbers is {num3}.")

Let's run the 1st cell:

Now the second:

Now the third:

The input function always returns a str, so the + operator in cell 2 performs concatenation instead of numeric addition. To perform numeric addition, these str need to be cast into int. Once this is done, the code works as intended. Once again you can copy and paste this code into Spyder and run cell by cell to explore the datatypes as the script is being executed:

# %% get two ints from the user
num1 = input("First number: ")
num2 = input("Second number: ")
# %% cast str to int
num1 = int(num1)
num2 = int(num2)
# %% add two ints
num3 = num1 + num2
# %% print my result
print(f"The addition of your two numbers is {num3}.")

numeric operators and order of operations

There are a number of numeric operations we can carry out with two integer numbers which in turn result in an integer number.

Now in Python we use a slightly different syntax, for example the = is not the equals sign but is used instead as an assignment operator. We have seen the use of the __add__ datamodel method and discussed how it maps to the + key. We also use convenient keys for the above operations such as ** for exponentiation, * for multiplication, // and % for floor division and modulo (remainder) respectively, + for addition and - for subtraction.

Let's carry out the same operations we sketched out in the equation editor:

# %% Create two ints
num1 = 3
num2 = 2
# %% Numeric Operations
pow_2 = num1 ** num2
mul = num1 * num2
floordiv = num1 // num2
mod_2 = num1 % num2
add = num1 + num2
dif = num1 - num2

We get the same values as expected:
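Since the screenshot of the variable explorer is not reproduced here, the values can be checked with a short sketch that repeats the cells above and prints each result:

```python
num1 = 3
num2 = 2

pow_2 = num1 ** num2       # 3 to the power of 2
mul = num1 * num2
floordiv = num1 // num2    # whole number of times 2 goes into 3
mod_2 = num1 % num2        # remainder left over from that division
add = num1 + num2
dif = num1 - num2

print(pow_2, mul, floordiv, mod_2, add, dif)
```

Note that floordiv * num2 + mod_2 gives back num1, which is how floor division and modulo fit together.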

The operations have been listed by precedence, so a multiplication operation will occur before an addition, for example. The order of operations can however be changed by use of parentheses:

This behaviour is mimicked in Python:

# %% Order of Operations
num1 = 2 + 3 * 2
num2 = (2 + 3) * 2

reassignment operators

For a numeric operator it is quite common to reassign a value. For example we can instantiate num1 in the 1st cell.

# %% Instantiate num1
num1 = 2
# %% Increase num1 by 1
num1 = num1 + 1

Then we can increase it by a step of 1 in the second cell.

Note the line of code in the second cell carries out the numeric operation on the right hand side of the assignment operator first, using the original value of num1. Once this numeric operation is completed, the result is reassigned to the object name num1 on the left hand side of the assignment operator:

# %% Increase num1 by 1
num1 = num1 + 1

Because reassignment is routinely used in Python, there are reassignment operators. Essentially we append the = sign to the normal operator i.e. **= for exponential, *= for multiplication, //= and %= for floor division and modulo (remainder) respectively, += for addition and -= for subtraction reassignment.

# %% Instantiate num1
num1 = 2
# %% Increase num1 by 1
num1 += 1

Restarting the kernel and then rerunning each cell gives the same behaviour as the reassignment of num1 earlier:

using an int for collection replication

Some operators will work between an instance of the int class and an instance of a collection. We can multiply a collection by an int to replicate it. This is commonly done quickly to print a str which includes a series of underscores in the console for example:

30 * "_"

Or we can make more complicated patterns:

10 * "|-|"

We can also use it to replicate items in lists:

5 * ["placeholder"]

3 * ["odd_ph", "even_ph"]
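As a sketch, the same expressions can be assigned to object names so the results show in the variable explorer. The names are made up for illustration:

```python
line = 30 * "_"
pattern = 10 * "|-|"
placeholders = 5 * ["placeholder"]
alternating = 3 * ["odd_ph", "even_ph"]

print(line)
print(pattern)
print(placeholders)
print(alternating)
print(len(line), len(pattern), len(placeholders), len(alternating))
```

Replicating a str repeats its characters, while replicating a list repeats its items in order.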


The next class of interest is the dict, an abbreviation for dictionary. Dictionaries are used to store key:value pairs. They are used routinely to map a hard to remember configuration such as a hexadecimal color code value for example "#FF0000" to a key that is easy to remember for example "red".

The key and the value are laid out in the form of a key and definition i.e. like a traditional dictionary and where the dict gets its name from. In a dictionary all keys have to be unique although some keys may have identical values.

In a hexadecimal code 2 digits correspond to a red LED, 2 digits correspond to a green LED and 2 digits correspond to a blue LED. Our eyes have 3 different types of detectors sensitive to these 3 wavelengths and our brain carries out color-mixing from the intensity ratios, making up every other color we perceive. Screens use arrays of these LEDs and the intensity values are set in software to make each part of the screen an appropriate color. In hexadecimal we use 16 characters 0,1,2,3,4,5,6,7,8,9,A,B,C,D,E and F. Two hexadecimal digits give 16*16 = 256 possible values and, because we start counting from 0, the largest value FF corresponds to 255. This means the corresponding LED is on full brightness. We can put together the hex codes of the primary and secondary colors as well as black and white using their full names as keys and 1 letter abbreviations as additional keys.

colors = {"red" : "#FF0000",
          "r" : "#FF0000",
          "green" : "#00FF00",
          "g" : "#00FF00",
          "blue" : "#0000FF",
          "b" : "#0000FF",
          "cyan" : "#00FFFF",
          "c" : "#00FFFF",
          "yellow" : "#FFFF00",
          "y" : "#FFFF00",
          "magenta" : "#FF00FF",
          "m" : "#FF00FF",
          "black" : "#000000",
          "k" : "#000000",
          "white" : "#FFFFFF",
          "w" : "#FFFFFF"}

Dictionaries are commonly used throughout Python. Color dictionaries like the above are routinely used for color selection in Python plotting libraries particularly matplotlib.

matplotlib line plot made using the colors 'r', 'g', 'b', 'c', 'y', 'm', 'k'
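
To make the hexadecimal arithmetic behind these color codes concrete, here is a minimal sketch that splits one code into its red, green and blue intensities. It assumes the inbuilt int class is given a base of 16 as its second argument; the names hex_code, red, green and blue are only illustrative:

```python
# Split a hexadecimal color code into red, green and blue intensities
hex_code = "#FF0000"             # the value stored under the key "red"

# Slice out each pair of digits and convert from base 16 to an int
red = int(hex_code[1:3], 16)     # "FF"
green = int(hex_code[3:5], 16)   # "00"
blue = int(hex_code[5:7], 16)    # "00"

print(red, green, blue)          # 255 0 0
```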

The pandas library which at this point can be considered as a library for working with spreadsheets often uses data within a dict to create a dataframe (which can be conceptualized as a spreadsheet) dataset. Each key will become a column of the dataframe and each value is an equally sized list which will correspond to the data in each column.

healthy_food = {"food" : ["apples", "bananas", "grapes"],
                "qty" : [1, 2, 30],
                "cal" : [116, 89, 62]}
A pandas dataframe made using the dict above

pandas also uses a dict for operations such as renaming. In such a case the keys are the original column names and these are mapped to the values, which are the new column names.

rename_cols ={"food" : "fruit",
              "qty" : "quantity",
              "cal": "calories"}
pandas dataframe columns renamed using a dict

A function may also expect one of its positional input arguments to be a dict which is used to change settings. For example in tkinter, which is a GUI library, the dict keys are prescribed to be the name of each setting to be changed and each value is the user's choice for that setting. Appropriate key:value pairs can be used to change the text size, font size, text color, background color and so on.

indexing using keys

The dict is a collection and can be indexed using square brackets. There is no numeric index but rather a key index usually of str values. We can index into the dict using the str "k", the 1 letter abbreviation for "black" to get the hexadecimal value of the color black.
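
A minimal sketch of this lookup, using a shortened copy of the colors dict so that it runs on its own (the name black_hex is only illustrative):

```python
# Shortened copy of the colors dict
colors = {"black": "#000000", "k": "#000000"}

# Index into the dict using the key in square brackets
black_hex = colors["k"]
print(black_hex)   # #000000
```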


The value "#000000" is returned as expected.

dict methods

The dictionary has a number of methods which can be accessed by typing the dict name followed by a dot . Unsurprisingly many of these methods are similar to the methods found in the list collection and behave in a similar manner, as both the list and dict are collections.

We can also index into a dictionary using the get method.


The main difference between indexing with the get method and indexing with square brackets is seen when a key doesn't exist. Indexing into a dict using square brackets with a key that doesn't exist will result in a KeyError and interrupt your script. When using the get method with no default value set, None is returned instead.
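
The difference can be sketched as follows, using a shortened copy of the colors dict. The try and except keywords are not covered in this guide and are only used here so that the example runs to completion; the names found, missing, fallback and key_error_raised are illustrative:

```python
colors = {"red": "#FF0000", "r": "#FF0000"}

found = colors.get("r")                # key exists
missing = colors.get("q")              # key absent, no default supplied
fallback = colors.get("q", "#000000")  # key absent, default supplied

print(found)     # #FF0000
print(missing)   # None
print(fallback)  # #000000

# Square brackets raise a KeyError for the same missing key
try:
    colors["q"]
    key_error_raised = False
except KeyError:
    key_error_raised = True

print(key_error_raised)  # True
```

Note that get does not change the dict, so the key "q" is still absent afterwards.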

The method setdefault is similar to get. It takes two positional input arguments, the first is the key whose value is to be looked up (if the key is present) and the second is a default value. If the key is absent, setdefault adds the key to the dict with this default value and returns it:

colors.setdefault("c", "#000000")
colors.setdefault("q", "#000000")

In the first case the value corresponding to the key "c" is returned, which is "#00FFFF". In the second case, the key "q" does not exist, so it is added to the dict with the default value and the default value is returned:
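
As a small sketch on a shortened copy of the colors dict (the names existing and added are illustrative), the returned values and the side effect on the dict can be checked as follows:

```python
colors = {"cyan": "#00FFFF", "c": "#00FFFF"}

existing = colors.setdefault("c", "#000000")  # key present, value unchanged
added = colors.setdefault("q", "#000000")     # key absent, so it is inserted

print(existing)       # #00FFFF
print(added)          # #000000
print("q" in colors)  # True
print(len(colors))    # 3
```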

A new key value pair can be added by indexing into the dictionary using a key that doesn't exist and assigning it to a new value. For example if we wanted to include the key "grey" and assign it the value "#C1C1C1":

colors["grey"] = "#C1C1C1"

We can obtain list like items for the keys and values respectively using the methods keys and values respectively:

# %% Create a dict of colors
colors = {"red" : "#FF0000",
          "r" : "#FF0000",
          "green" : "#00FF00",
          "g" : "#00FF00",
          "blue" : "#0000FF",
          "b" : "#0000FF",
          "cyan" : "#00FFFF",
          "c" : "#00FFFF",
          "yellow" : "#FFFF00",
          "y" : "#FFFF00",
          "magenta" : "#FF00FF",
          "m" : "#FF00FF",
          "black" : "#000000",
          "k" : "#000000",
          "white" : "#FFFFFF",
          "w" : "#FFFFFF"}
# %% Obtain keys and values
keys = colors.keys()
values = colors.values()

Not many details about these variable types display in the variable explorer but we can print them to the console and see that they are indeed like lists:

Then we can cast them to lists:

# %% Cast to lists
keys = list(keys)
values = list(values)

Which we can readily view within the variable explorer:

Now that we have two equally sized lists, we can see how we can make a dictionary from them using the zip function:

# %% Create a zipped object
colors_zipped = zip(keys, values)

This creates a zipped object which we can't see many details about within the variable explorer:

However we can cast it back to a dictionary using:

# %% Create a dict from the zipped object
colors2 = dict(colors_zipped)

This gives a dict colors2 which is identical to the original dict colors as expected:
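
The round trip can be sketched on a shortened dict as follows. The final line is an addition not covered above: a zipped object can only be read through once, so casting it a second time gives an empty list (the name leftover is illustrative):

```python
colors = {"red": "#FF0000", "green": "#00FF00", "blue": "#0000FF"}

# Split the dict into two equally sized lists
keys = list(colors.keys())
values = list(colors.values())

# Pair the lists up again and cast back to a dict
colors_zipped = zip(keys, values)
colors2 = dict(colors_zipped)
print(colors2 == colors)         # True

# The zipped object has now been used up
leftover = list(colors_zipped)
print(leftover)                  # []
```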


We can use the type class to determine the datatype of an object.

Let's re-examine the concept of casting. We will create a cell which will assign the object name num1 as a str. The next cell will determine its datatype and assign it to an output type_1_num1. The next cell will cast its datatype to an int and finally the last cell will determine its datatype and assign it to an output type_2_num1.

# %% create num1 as str
num1 = "2"
# %% determine datatype
type_1_num1 = type(num1)
# %% cast to int
num1 = int(num1)
# %% determine updated datatype
type_2_num1 = type(num1)

Let's run the first and second cell. Notice that instances of the type class do not display on the variable explorer. The main reason for this is that the variable explorer already displays the datatype of each object making the type instances redundant with respect to the variable explorer.

The instance type_1_num1 does exist and displays as str when called in the console:

When we typically use the type class we input it directly into the console with no assigned object name. This displays the output of the datatype within the console.


Once again we can see that the datatype displays within the console.


The type class is mainly used to check whether a variable is a certain datatype. We can check for equality using the is equal to == logical operator, not to be confused with object assignment using =.

type(num1) == int
type(num1) == str

We are given back the bool results False and True.

The is equal to == logical operator should not be confused with the assignment statement =. Let's add a new cell that includes both of these in one line. The check for equality is carried out on the right hand side and will return a bool. This bool value is then assigned to the object name is_int. Let's run this script:

# %% create num1 as str
num1 = "2"
# %% determine datatype
type_1_num1 = type(num1)
# %% cast to int
num1 = int(num1)
# %% determine updated datatype
type_2_num1 = type(num1)
# %% datatype check
is_int = type(num1) == int

We see that is_int is the type bool and has a value of True. A bool can have one of two values True or False.

Note when True and False (capitalized and no quotation) are used in the script editor they are color coded. Lower case should not be used as the expectation will be an object name. Quotations should not be used otherwise the expectation will be a str.

The bool can be instantiated explicitly using the bool class but this is not commonly done.

logical operators

For the class type we can essentially ask one of two questions. Is the object's datatype equal to a given datatype, which as we have seen uses the is equal to operator ==, and is the object's datatype not equal to a given datatype, which uses the not equal to != operator.

type(num1) == int
type(num1) != int

For int datatypes which are ordered discretely, supplementary logical operators may be used to compare two values. We have the greater than >, greater than or equal to >=, less than < or less than or equal to <= operators. In the console let's see if 5 is greater than 3:

5 > 3

We get True as expected. Let's now also check to see if 5 is greater than or equal to 3:

5 >= 3

We also get True. For clarity let's now check to see if 5 is greater than 5:

5 > 5

We get False as expected. Let's now check to see if 5 is greater than or equal to 5:

5 >= 5

We get True as expected.

We can combine boolean values using the and or or keywords:

True and True

Makes True.

True and False

Makes False.

False and False

Makes False.

True or True

Makes True.

True or False

Makes True.

False or False

Makes False.

When using logical operators with the and or or keywords, it is good practice to use parentheses to enclose each condition to make the code more readable. For example we can check if 5 is greater than 3 and if 2 is less than or equal to 3:

(5 > 3) and (2 <= 3)

Since both these conditions are True we get the result True.

Further conditions can be queried by building up more complicated expressions using the syntax above.

bool arithmetic

Although it may not seem so at first glance, the bool is actually numeric. True corresponds to a value of 1 and False corresponds to a value of 0. This can be checked using:

1 == True
0 == False

Both queries are True.

Instances of the bool class therefore can be used to carry out numeric operations in an identical manner to the int datatype for example the following will give 1 * 0 = 0 and 1 + 1 = 2 respectively:

True * False
True + True
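
One practical use of this, sketched below under the assumption that we want to count how many conditions are met, is to add up a list of bool values with the inbuilt sum function. The names checks and passed are illustrative:

```python
# Each comparison gives a bool
checks = [5 > 3, 2 <= 3, 5 > 5]

# True counts as 1 and False counts as 0
passed = sum(checks)
print(passed)                # 2

# Arithmetic on bool values gives an int
print(type(True + True))     # <class 'int'>
```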


Earlier we examined int numeric datatypes and used int for tasks such as indexing a collection and getting its length. We stuck purposely to int based mathematical operations for example in the case of division we obtained the largest possible floored integer and associated integer remainder (modulo). We can also carry out float division which will always obtain a float.

Let's use a floor divide of 3 by 2. This gives 1 complete division by the value of 2 and a remainder (modulo) of 1, i.e. 1 remainder 1.

The float divide on the other hand gives the float 1.5. float is an abbreviation for floating point number which is essentially a number that has a decimal place. Note the . in this case denotes the decimal point (and is not used to select attributes from an object).

# %% Integer Division
floordiv = 3 // 2
# %% Integer Remainder
mod = 3 % 2
# %% Float Division
div = 3 / 2
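
As a quick check of how these three results relate to one another, the sketch below rebuilds the original number from the floor division and the remainder, and shows that float division gives a float even when the result is a whole number (the names rebuilt and even_div are illustrative):

```python
floordiv = 3 // 2   # 1
mod = 3 % 2         # 1
div = 3 / 2         # 1.5

# floor division and remainder together recover the original int
rebuilt = floordiv * 2 + mod
print(rebuilt)      # 3

# float division always gives a float
even_div = 4 / 2
print(even_div)     # 2.0
print(type(floordiv), type(even_div))   # <class 'int'> <class 'float'>
```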

float precision

All the numerical operations previously demonstrated with an int can be carried out with a float, i.e. **, *, +, -, /, **=, *=, +=, -=, /=. Comparison operations such as ==, !=, <, <=, >=, > can also be carried out. However there can be some nuances when carrying out such operations with a float. Compare the following 2 lines of code. In theory the order in which the multiplications are carried out should not matter, and the calculation is relatively easy to perform using a pen and paper. However when using Python they result in different answers:

3 * 3.14 * 3
3 * 3 * 3.14

We can see that both numbers are similar. In the case of the first number, we can see that we have a recurring 9…

When we count we use the characters 0,1,2,3,4,5,6,7,8 and 9 i.e. 10 values or decimal notation. Computers count using binary, effectively a series of switches which each have a value of 0 or 1. Many values that can be written exactly in decimal, such as 3.14 or 0.1, become recurring values in binary as we only have 2 unique characters to represent a number as opposed to 10. An analogy in decimal notation is the concept of a third i.e. 0.333…

We can cast it to a str and then use the len function to find out that it is 18 characters long; 1 character corresponds to the decimal point itself so the number is represented with 17 digits.

len(str(3 * 3.14 * 3))

A computer stores a float using a fixed amount of memory (64 bits), which corresponds to a precision of roughly 15 to 17 significant figures in decimal. Anything beyond this cannot be stored and is effectively rounded off. This results in the rounding error seen above. In most real life applications we don't need to store a number to a precision of 17 significant figures and can round it to the desired precision.

For example if 3.14 was a cm length measured on a ruler like the one below, the precision will only be accurate to the 1st or 2nd decimal place as there are only markings representing the unit (cm) and the 1st decimal place (mm), and one can only roughly infer between these markings to get the 2nd decimal place.

In Python we can use the round function to round a float to a desired number of decimal places:

We see from the docstring that we can use the float under investigation as the 1st positional argument and provide an optional int second argument to specify the number of decimal places. If this 2nd optional int is not supplied, it will automatically round to an int:

round((3 * 3.14 * 3), 2)
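
The effect of leaving out the optional second argument can be sketched as follows (the names value, rounded_2dp and rounded_int are illustrative):

```python
value = 3 * 3.14 * 3

# Round to 2 decimal places, the result is still a float
rounded_2dp = round(value, 2)
print(rounded_2dp)         # 28.26

# Without the second argument the result is an int
rounded_int = round(value)
print(rounded_int)         # 28
print(type(rounded_int))   # <class 'int'>
```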

Because of the float rounding errors, care should be taken when using comparison operators. For example:

0.1 + 0.2 == 0.3

is False. If we look at the left hand side we can see there is once again a rounding error.

Rounding both sides to a sensible number of decimal places, e.g. in this case 1, will result in the comparison being True.

round((0.1 + 0.2), 1) == round(0.3, 1)
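
An alternative that is not used in this guide, shown here only as a sketch, is the isclose function from the math module in the Python standard library. It checks whether two floats are equal to within a small tolerance rather than exactly (the names lhs and close are illustrative):

```python
import math

lhs = 0.1 + 0.2
print(lhs)                    # 0.30000000000000004
print(lhs == 0.3)             # False

# Compare to within a small relative tolerance instead
close = math.isclose(lhs, 0.3)
print(close)                  # True
```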

scientific notation

Physically we tend to measure objects with respect to ourselves. The imperial system for example used the inch (~ pinch), hand, foot, yard, step, pace and mile which were all made relative to the human.

The human is of course a tiny object compared to a planetary object such as a star and a massive object compared to an atomic structure such as a hydrogen atom.

Nowadays most of the world has switched over to the metric system, with the exception of the US and the UK (which uses a half imperial and half metric measurement system).

The metric system is based around the numbers 0,1,2,3,4,5,6,7,8,9 and therefore uses powers of 10. For convenience, every factor of a thousand away from the unit is typically represented with a prefix.

Number                 Power of 10 (with Respect to Unit)    prefix    single letter
0.001                  -3                                    Milli     m
0.000 001              -6                                    Micro     µ
0.000 000 001          -9                                    Nano      n
0.000 000 000 001      -12                                   Pico      p

For example:

  • 696,340 kilometres – radius of the sun
  • 1.753 metres – height of a human
  • 120 picometres – radius of hydrogen atom

In Python these very large and very small numbers are represented using scientific notation. We type in the number without commas or spaces. Instead of a prefix we use e followed by the power of 10 with respect to the unit. For example:

# %% measurements
r_sun = 696340e3
h_human = 1.753
r_hydrogen = 120e-12

Now the sun is actually made up of hydrogen atoms. The radius of each hydrogen atom is tiny compared to the radius of the sun. The addition or subtraction of the radius of 1 atom is so tiny compared to the sun's radius that it falls well beyond the precision a float can store (and is far smaller than the error in measuring the sun's radius), so the sun's radius is actually going to be unchanged following this calculation.

Therefore the following comparison returns True.

(r_sun - r_hydrogen) == r_sun

We can however calculate the number of hydrogen atoms required side by side to make the radius of the sun.

r_sun / r_hydrogen

We are returned a very large number, which Python displays in scientific notation with an exponent of e+18.
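
The points above can be checked with a short sketch. It confirms that a number typed in scientific notation is an ordinary float, that subtracting the hydrogen radius leaves the float unchanged, and it prints the ratio (the names unchanged and atoms are illustrative):

```python
r_sun = 696340e3        # 696,340 kilometres in metres
r_hydrogen = 120e-12    # 120 picometres in metres

# e3 means multiply by 10 ** 3
print(r_sun == 696340000.0)   # True
print(type(r_sun))            # <class 'float'>

# The difference is too small for a float to register
unchanged = (r_sun - r_hydrogen) == r_sun
print(unchanged)              # True

# Number of hydrogen radii that fit along the radius of the sun
atoms = r_sun / r_hydrogen
print(atoms)
```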


Let's create a very basic list:

# %% create list
list_1 = ["hello", "world"]

Note the number of attributes we have available; most of these are methods that mutate the list:

We can use the tuple class to cast a list to a tuple:

# %% create list
list_1 = ["hello", "world"]
# %% cast to tuple
tuple_1 = tuple(list_1)

Running this code block will create a tuple. Notice on the variable explorer that the two collections are very similar. The main difference that can be observed is in the brackets used. Inputting the tuple's name followed by a dot . shows the list of tuple attributes. Note that the tuple methods available, count and index, do not mutate the tuple, i.e. a tuple is essentially a restricted list that cannot be mutated:

As a tuple cannot be mutated it typically takes up slightly less memory than the equivalent list and can be slightly faster to work with, although this is only really important when dealing with a large data set.

A list can be nested within a tuple and vice versa. To understand mutability in more detail, let's create a code block that creates two lists list_1 and list_2. Then a code block which appends list_2 to list_1, then a code block which casts list_1 to a tuple and finally a code block that reverses list_2:

# %% create lists
list_1 = ["hello"]
list_2 = ["world", "planet earth"]
# %% append list_2 to list_1
list_1.append(list_2)
# %% cast to a tuple
tuple_1 = tuple(list_1)
# %% mutate list_2
list_2.reverse()
Running the first code block creates list_1 and list_2 as expected:

Running the next code block appends list_2 to list_1 as expected:

Casting to a tuple works as expected:

Reversing list_2 mutates list_2, and the change is also seen in list_1 and tuple_1 because both of these hold a reference to the same list object, i.e. the list within the tuple can still be mutated:

We can index into a tuple in an identical manner to a list, using square brackets. If we select index 1, we will get the list ["planet earth", "world"] and because this is a list, we can access the list attributes allowing us to directly mutate the list:
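
The whole sequence can be sketched in one self-contained block. The is keyword and the try and except keywords are not covered in this guide and are only used to make the checks explicit; the appended str "home" and the names same_object and tuple_mutated are illustrative:

```python
list_1 = ["hello"]
list_2 = ["world", "planet earth"]
list_1.append(list_2)
tuple_1 = tuple(list_1)
list_2.reverse()

# The list inside the tuple is the very same object as list_2
same_object = tuple_1[1] is list_2
print(same_object)     # True

# Mutating the list through the tuple also changes list_2
tuple_1[1].append("home")
print(list_2)          # ['planet earth', 'world', 'home']

# The tuple itself refuses item assignment
try:
    tuple_1[0] = "goodbye"
    tuple_mutated = True
except TypeError:
    tuple_mutated = False

print(tuple_mutated)   # False
```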

parenthesis (order of operations), parenthesis (function inputs) and parenthesis (tuple collection)

If we return to our color dict, we can see that the items method returns a dict_items collection, which is essentially a list of tuples, each with 2 elements (the key and the value):

# %% Create a dict of colors
colors = {"red" : "#FF0000",
          "r" : "#FF0000",
          "green" : "#00FF00",
          "g" : "#00FF00",
          "blue" : "#0000FF",
          "b" : "#0000FF",
          "cyan" : "#00FFFF",
          "c" : "#00FFFF",
          "yellow" : "#FFFF00",
          "y" : "#FFFF00",
          "magenta" : "#FF00FF",
          "m" : "#FF00FF",
          "black" : "#000000",
          "k" : "#000000",
          "white" : "#FFFFFF",
          "w" : "#FFFFFF"}
# %% Obtain keys and values
keys = colors.keys()
values = colors.values()
items = colors.items()

As discussed earlier, we would be able to mutate the list but not each of the tuples within the list.

However sometimes the form of the list [ ] is preferred to the form of a tuple ( ) when using either as an input to a function, as functions use ( ) to enclose their input arguments. In the above output one can clearly see dict_items( ) with the parentheses being used to enclose the input, the [ ] indicating the list of items and each item in the list being a tuple. This may be harder to follow if the outer list was also a tuple.

We can create a tuple directly using ( ) and a comma as a delimiter. Care should however be taken when attempting to create a tuple of a single item because the ( ) as seen earlier is used as parenthesis particularly for mathematical operations. To make a tuple of a single item we need to add a comma at the end of the single item even though we don't intend to add an additional item:

tuple_1 = ("world", "planet earth")
str_2 = ("world")
tuple_3 = ("world", )

Some Python programmers will add a , at the end of every collection (list, tuple and dict) regardless of how many items are present.
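
The effect of the trailing comma can be checked with the type class and the len function, as in this short sketch:

```python
tuple_1 = ("world", "planet earth")
str_2 = ("world")        # parentheses only, so this is still a str
tuple_3 = ("world", )    # trailing comma makes a tuple of 1 item

print(type(str_2))       # <class 'str'>
print(type(tuple_3))     # <class 'tuple'>
print(len(tuple_3))      # 1
print(len(str_2))        # 5, the number of characters in "world"
```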

Care should be taken to differentiate between parentheses used to enclose input arguments in a function, parentheses used to create a tuple and parentheses used to determine the order within a calculation.

round(max((1.14, (2.14 + 3.14))), 1)

Take your time to make sure you understand what all the brackets mean in this line of code.

As you type a bracket in either the script editor or the console, Spyder will automatically highlight the bracket selected alongside the matching bracket to assist with this.

Sometimes more lines of code are better for the sake of readability than making a concise line of code that is hard to follow.

val_1 = (2.14 + 3.14)
tuple_1 = (1.14, val_1)
max_1 = max(tuple_1)
round_max_1 = round(max_1, 1)


We have already used the most common brackets on the keyboard, [ ], ( ) and { }, for the list, tuple and dict respectively. Let's create an empty tuple, empty list and empty dict using the corresponding brackets:

tuple_1 = ()
list_1 = []
dict_1 = {}

set { } vs dict { }

All the collections above use the comma , as a delimiter however the dict also uses the colon : to separate each key and value. Another inbuilt collection is the set, which can only store unique values. Let's create an empty instance of each of these using the class directly:

tuple_1 = tuple()
list_1 = list()
dict_1 = dict()
set_1 = set()

Observe that the set and the dict are both written using braces { }. A set that has values in it can be differentiated from a dict as no colons are used. Note that { } on its own creates an empty dict, so an empty set has to be created using set().

Let's create a set using three str values, one of which is a duplicate. Notice that the set when displayed on the variable explorer only displays the unique values, i.e. sets cannot have duplicate values:

set_1 = {"hello", "world", "world"}
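
A short sketch of this behaviour is given below. It also shows one common use, casting a list to a set to drop duplicates, and confirms that empty braces give a dict rather than a set (the names words and unique_words are illustrative):

```python
set_1 = {"hello", "world", "world"}
print(len(set_1))               # 2

# Casting a list to a set removes duplicate values
words = ["hello", "world", "world"]
unique_words = set(words)
print(unique_words == set_1)    # True

# Empty braces make a dict, an empty set needs the set class
print(type({}))                 # <class 'dict'>
print(type(set()))              # <class 'set'>
```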

set methods

The set has a number of attributes, which can be accessed by typing the set instance name followed by a dot . most of these are set methods.

We can call the method issubset from a second set, set_2, to check if set_2 is a subset of set_1 (i.e. all values in set_2 are in set_1). Alternatively we can call the method issuperset from set_1 to check if set_1 is a superset of set_2 (i.e. all values of set_2 are in set_1). These give a bool result.

The intersection of two sets is the set of values that exist in both sets. The difference is the set of values that exist in set_1 that are not present in set_2. The union is the set of all the values in set_1 and set_2. These three methods have a return statement which produces a new set.
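
These methods can be sketched with some small sets of int values. The contents of set_1, set_2 and set_3 are made up for illustration, and the inbuilt sorted function is only used so the values print in a predictable order:

```python
set_1 = {1, 2, 3, 4}
set_2 = {3, 4}
set_3 = {4, 5}

# Subset and superset checks give a bool
is_subset = set_2.issubset(set_1)
is_superset = set_1.issuperset(set_2)
print(is_subset, is_superset)    # True True

# Each of these returns a new set and leaves set_1 unchanged
common = set_1.intersection(set_3)
only_in_1 = set_1.difference(set_3)
combined = set_1.union(set_3)

print(sorted(common))            # [4]
print(sorted(only_in_1))         # [1, 2, 3]
print(sorted(combined))          # [1, 2, 3, 4, 5]
```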


The methods difference_update and intersection_update, instead of returning an output, will mutate the set the methods are called from.
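
A minimal sketch of the in place versions, again with made up int values (the name returned is illustrative):

```python
set_1 = {1, 2, 3, 4}
set_2 = {3, 4}

# Nothing is returned, set_1 itself is changed
returned = set_1.difference_update(set_2)
print(returned)            # None
print(set_1 == {1, 2})     # True

# intersection_update keeps only the values found in both sets
set_3 = {1, 2, 3, 4}
set_3.intersection_update(set_2)
print(set_3 == {3, 4})     # True
```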



In mathematics, no real number can satisfy the following equation:

num = (-1) ** 0.5

imaginary number j

The datatype is complex, with a real and an imaginary component. The real component here is a float of the order of e-17; in other words it is effectively 0 and what we see is a rounding error.

The imaginary unit j is used to satisfy the condition:

1j * 1j == -1

In maths and physics the symbol i is typically used for this term and a multiple of it is known as an "imaginary" number. In electrical engineering, j is used instead as i is reserved for the current. Python uses the same symbol as electrical engineers, j.

Let's create a complex number, we can do so explicitly using the complex class. It has two keyword input arguments real and imag which can be used to assign the real and imag component respectively. These keyword arguments have a default value of 0:

num_1 = complex(real=1, imag=0)

Alternatively we can make a complex number by use of j:

num_1 = 1 + 2j

complex numbers also have a number of attributes which can be accessed by typing in the object name of the complex instance followed by a dot . and Kite will display these as a dropdown list.

real and imag will read off the real and imaginary components as attributes:


The conjugate method returns the complex conjugate:


Essentially it returns the complex number with the sign of the imaginary component flipped. Recall that the origin of j is from the sqrt of -1. Multiplication of the complex number by its complex conjugate will return a number with a real only component:

num_1 * num_1.conjugate()
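
As a worked sketch, assuming num_1 is the complex number 1 + 2j, the attributes, the conjugate and the product can be checked as follows (the names conj and product are illustrative):

```python
num_1 = 1 + 2j

# The components are stored as floats
print(num_1.real)    # 1.0
print(num_1.imag)    # 2.0

# The conjugate flips the sign of the imaginary component
conj = num_1.conjugate()
print(conj)          # (1-2j)

# (1 + 2j) * (1 - 2j) = 1 + 4 = 5 with no imaginary component left
product = num_1 * conj
print(product)       # (5+0j)
```

The result is still of the complex datatype, but its imaginary component is 0.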


Another builtin class is the range class which can be used to produce a sequence of numbers:

Recall that Python uses zero order indexing, meaning we are inclusive of the lower bound but exclusive of the upper bound. The range function has 1-3 positional arguments. If all 3 are specified these correspond to:

range(start, stop, step)

The step is assumed to be 1 if only 2 positional input arguments are supplied:

range(start, stop)

The step is assumed to be 1 and the start is assumed to be 0 if only a single positional input argument is supplied:

range(stop)

The range object has the start, stop and step attributes. Let's create a range object with only the stop value supplied:

# %% create range instance
rng = range(10)

We can access a number of attributes by typing in the object name of the range instance followed by a dot . and a dropdown list will be displayed by Kite:

The attributes start, stop and step correspond to the values specified when instantiating the range object:

# %% examine attributes
start = rng.start
stop = rng.stop
step = rng.step

We can see that start=0 and step=1 (the default values) are used when only stop is supplied. In this case stop=10 was assigned:

To understand a range object in more detail we can cast it to a list:

# %% create numeric list from range instance
range_list = list(rng)

Note that the list begins at 0 and increments in steps of 1 until it reaches the value of 10. The value of 10 (the upper bound) is however excluded as we use zero-order indexing and therefore the last value on the list is 9:

Let's now examine a range object which has a start and step specified:

# %% create range instance
rng = range(1, 10, 3)
# %% examine attributes
start = rng.start
stop = rng.stop
step = rng.step
# %% create numeric list from range instance
range_list = list(rng)

Now we can see that we start at 1 (inclusive) and go up in steps of 3 towards 10. The upper bound of 10 is excluded and the last value before 10 is 7:
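
The three ways of calling range can be compared side by side in a short sketch. The use of len and the in keyword on a range object is an addition to what is described above (the names rng and upper_included are illustrative):

```python
# stop only, start defaults to 0 and step defaults to 1
print(list(range(4)))           # [0, 1, 2, 3]

# start and stop, step defaults to 1
print(list(range(1, 4)))        # [1, 2, 3]

# start, stop and step
rng = range(1, 10, 3)
print(list(rng))                # [1, 4, 7]

# The upper bound is never included
print(len(rng))                 # 3
upper_included = 10 in rng
print(upper_included)           # False
```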

enumerate function

The range object is a sequence of numbers. A related enumeration object can be made by the enumerate function which expects an object with an index such as a list:

Let's create a code block to create a list, a code block to create an enumeration object and a code block to cast this to a list so we can see what this object is:

# %% create list
list_1 = ["hello", "world"]
# %% enumerate list
enum_list_1 = enumerate(list_1)
# %% cast to list
list_2 = list(enum_list_1)

Running the 1st code block creates the list of str, with index 0 and index 1:

The enumeration object is created after running the 2nd cell but not many details about it display on the variable explorer:

Casting back to a list shows that each value in the list is now a tuple. The 0th index of each tuple is the original index and the 1st index of each tuple is the original value:
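
A minimal sketch of the result, indexing into the first tuple to read off the original index and the original value (the names first_index and first_value are illustrative):

```python
list_1 = ["hello", "world"]
enum_list_1 = enumerate(list_1)
list_2 = list(enum_list_1)

print(list_2)                # [(0, 'hello'), (1, 'world')]

# Each item is a tuple of (index, value)
first_index = list_2[0][0]
first_value = list_2[0][1]
print(first_index)           # 0
print(first_value)           # hello
```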

Both the range object and the enumerate object are useful for iterating in Python. This is discussed in a separate guide which moves away from procedural programming and examines the use of code blocks.