Getting Started with Python and the Spyder 5 IDE (Object Orientated Programming and the dot Syntax)

Python is an Object Orientated Programming (OOP) language where everything we interact with is an object. This guide will use the Spyder 5 IDE to explore some of the main concepts behind object orientated programming focusing specifically on the dot syntax.

If you haven't already done so, please follow my instructions to install Anaconda 2021-05 and optionally update to Spyder 5.

Python modules

A module is a Python script where a number of objects are assigned.

The assignment operator = is used for assignment:

object_name = value

It is perhaps more useful to approach the line above from the centre, then the right hand side and then the left hand side. Doing so translates the syntax into English, producing the sentence:

assign the value to the object_name

Notice that in the English language syntax we have additional words to make the sentence sound grammatically correct. In Python often we want to simplify things to increase typing efficiency and therefore do not include these filler words.

In Spyder in the script editor, let's assign the float value 3.14 to the object name pi using the assignment operator =.

We can then save this Python script by using File→Save as…

And in this case save it as script0.py:

Now we will select File→New File…

We will then save it as script1.py

Now that both Python script files are saved, for convenience, we can right click the tabs and select Split Vertically:

Now we will look at multiple ways of accessing the objects defined in script0 in the new script1. script0 can be considered a module i.e. a script file where objects are assigned.

from module import object

We can import an object from a module using the from and import keywords. This has the general form:

from module import object

Note that when we reference the module (script0.py) we do not include the .py extension.

It may be helpful to conceptualize the module as a box and the object as an item in the box.

The line above therefore would become:

from box import item

In English language syntax we would probably approach this statement from the right hand side first and change this to:

import the item from the box

Notice the analogy of the approach discussed when using the assignment operator i.e. changing the order and removing the filler words.

Let's add the following code to our script files and save each script.

script0.py

pi = 3.14

script1.py

from script0 import pi
print(pi)

In the files tab we can see both scripts are in the same folder. Highlighting script1.py we can run it:

We can see that the float object pi is imported and displays on the variable explorer. script1 will also print the output to the console.

objects: variables and functions

We can use the assignment operator to assign another object i.e. use the form:

object_name_0 = value_0
object_name_1 = value_1

We can also define a function using the def keyword, followed by a space and then the function's name. On the same line, parentheses enclose any input arguments (none are supplied for this particular function) and the line ends with a colon :, which indicates the beginning of a code block.

Any code belonging to the code block is indented by 4 spaces. The first three lines of the code block are the document string """ """.

Functions can optionally include a return statement which can be used to return an output.

def function_name(*args, **kwargs):
    """
    docstring
    """
    code
    return output

In English we can think of the top line as being:

define the object function_name to be a function (which takes the following positional input arguments *args and keyword input arguments **kwargs):

The code belonging to this function is …

Use the code above to create an output from a calculation involving *args and **kwargs

return the output

In Python, object_name_0, object_name_1 and function_name are all objects. object_name_0 and object_name_1 are both variables which have merely been assigned a value (using the assignment operator). function_name on the other hand is a function and a function is designed to operate on provided input arguments (if applicable) to perform some operation and return an output variable or print an output variable to the console.
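As a concrete illustration of the general form above, here is a minimal sketch of a function with a docstring, a calculation and a return statement. The function name rectangle_area and its input arguments width and height are hypothetical, chosen only for this example, and are not part of the scripts used in this guide.

```python
def rectangle_area(width, height=1.0):
    """
    returns the area of a rectangle.
    """
    area = width * height
    return area


# two positional input arguments
area_0 = rectangle_area(2.0, 3.0)
# one positional and one keyword input argument
area_1 = rectangle_area(2.0, height=4.0)
# height is not supplied, so its default value of 1.0 is used
area_2 = rectangle_area(2.0)
```

The returned value can be assigned to an object name, which is why area_0, area_1 and area_2 would display on the variable explorer while rectangle_area itself would not.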

Let's restart the Kernel (select Consoles and then Restart Kernel) and update script0 to be of this form.

script0.py

pi = 3.14
e = 2.72

def greeting():
    """
    prints hello to the console.
    """
    print("hello")

When we run script0 directly, notice that e and pi display on the variable explorer but the greeting function doesn't.

We can have a look at the local scope of the console otherwise known as local directory of the console by typing in:

dir()

At the moment we can ignore the datamodel methods which begin and end with a double underscore __ (also colloquially known as double underscore or dunder methods), the private object names which begin with an underscore _ and exit and quit which are always present in the console's namespace.

Of specific interest to us, we can see that the objects e, greeting and pi are all in the console's local directory.

In the console we can look at the three objects by typing in their object name:

pi
e
greeting

In the case of the pi and e (variables), the values assigned to these object names are printed to the console while in the case of greeting (a function) we are only informed that it is a function.

In other words all we have done is reference the objects above. In the case of a function we can either reference it or call it. To call a function we need to provide parentheses alongside any required input arguments (this particular function has none as it will always print the same static word hello).

Notice that as we type the function name followed by an open parenthesis, the tooltip appears and displays the function's docstring.

In this case the function has no input arguments but it must still be called by use of parentheses.

greeting()
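The difference between referencing and calling a function can be sketched as follows. The object names reference, result and is_same_object are assumptions introduced for this example only.

```python
def greeting():
    """
    prints hello to the console.
    """
    print("hello")


# no parentheses: the function is only referenced, nothing is printed
reference = greeting
# parentheses: the function is called and hello is printed
result = greeting()
# reference is just another object name for the same function
is_same_object = reference is greeting
```

Because greeting has no return statement, the object name result is assigned None.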

from module import object1, object2, …

We can use the same notation as previously used to import a single object, to import multiple objects. All we need to do is include a comma as a delimiter for all the objects we wish to import. i.e. we import multiple objects using the form:

from module import object_0, object_1, object_2, ...

Once again it can be useful to think of the module as a box:

from box import item_0, item_1, item_2, ...

And to construct the syntax in English language:

import the items item_0, item_1, item_2, and … from the box

Let's restart the Kernel (select Consoles and then Restart Kernel) and add the following code to our scripts.

script0.py

pi = 3.14
e = 2.72

def greeting():
    """
    prints hello to the console.
    """
    print("hello")

script1.py

# %% Imports
from script0 import pi, e, greeting
# %% printing variables (instances)
print(pi)
print(e)
# %% calling functions
greeting()

We can run the zeroth cell of script1:

Once again we will see e and pi appear on the variable explorer. The function greeting will not display but it will be in the console's local directory which we can check once again by using:

dir()

Because these are assigned, running the remaining cells will print the values of the variables and call the function as expected.

from module import *

Let's restart the Kernel (select Consoles and then Restart Kernel). Now, instead of importing each object individually, * (which denotes all) can be used to import all objects available from a module.

from module import *

Once again conceptualizing the module as a box:

from box import *

And taking * to be all we can construct the English language syntax:

import all items from the box

Let's update our scripts to be of that form:

script0.py

pi = 3.14
e = 2.72

def greeting():
    """
    prints hello to the console.
    """
    print("hello")

script1.py

# %% Imports
from script0 import *
# %% printing variables (instances)
print(pi)
print(e)
# %% calling functions
greeting()

While the code executes with e and pi displaying on the variable explorer and e, pi and greeting being imported into the console's namespace, you will see several warnings in script1.py.

The * import is not normally recommended as it is unclear where each object is imported from. Also if you are dealing with large scripts with hundreds to thousands of lines of code each, it is possible that both of these scripts may use a common object name. Take the analogy of having two large boxes full of Lego and emptying them (importing all Lego bricks) on the floor. You will be unsure of what box each Lego brick came from.

We can illustrate this by creating three scripts. script1 will import everything from script0 and script2, and both of these scripts will have an object name e defined. When the 0th code block is run, e becomes the float 2.72 as imported from script0.

However when the next code block is run, e gets reassigned to the str "e" as imported from script2. In the scenario where the user was not interested in the object e from script2 but only interested in other possible objects from script2 (not included for clarity), it is likely that they would run into other issues such as a TypeError due to e being accidentally redefined from the float 2.72 to the str "e".
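A minimal sketch of this name collision follows. So that it runs from a single file, it writes two small stand-in modules to a temporary folder and uses exec with a dictionary that plays the role of the console's local directory. The module names demo_star_0 and demo_star_2, the temporary folder and the use of exec are all assumptions made for this example, not the scripts shown in this guide.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# hypothetical stand-ins for script0 and script2 in a temporary folder
folder = Path(tempfile.mkdtemp())
(folder / "demo_star_0.py").write_text("pi = 3.14\ne = 2.72\n")
(folder / "demo_star_2.py").write_text('e = "e"\n')
sys.path.append(str(folder))
importlib.invalidate_caches()

# namespace plays the role of the console's local directory
namespace = {}

# the first * import brings in pi and e from demo_star_0
exec("from demo_star_0 import *", namespace)
e_after_first_import = namespace["e"]

# the second * import silently reassigns e
exec("from demo_star_2 import *", namespace)
e_after_second_import = namespace["e"]

print(type(e_after_first_import).__name__)
print(type(e_after_second_import).__name__)
```

No warning or error is raised at the point where e changes from a float to a str, which is what makes this kind of mistake hard to track down.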

import module and module.object

We can also directly import the module using the form:

import module

Then we can access objects defined from the module using the dot syntax.

module.object_1

It may be helpful to think once again of the module as a box and the . as an arrow → i.e.

box→object_1

The definition of this syntax is essentially taking object_1 out of this box (in this case module).

Let's restart our kernel and update our scripts to the following lines of code.

script0.py

pi = 3.14
e = 2.72

def greeting():
    """
    prints hello to the console.
    """
    print("hello")

script1.py

# %% Imports
import script0
# %% calling functions
script0.greeting()

If we run the code notice that the objects pi and e do not display on the variable explorer and the function greeting was executed as can be seen from the console output.

If we look at the local directory of the console:

dir()

We can see that only script0 is listed:

We can look at the local scope of script0 using:

dir(script0)

We can see the autocompletion in the console if we type in the name of the module followed by a dot (the console uses Jedi completion and doesn't have full integration with Kite, so a tab is required to show the list of objects which can be accessed from the module).

Moving over to the script editor which uses Kite, we can see that the function greeting, the text (variables) pi and e are shown.

import module as alias

We can also import the module as a custom object name or alias.

import module as alias

We then need to use dot indexing from the alias object name:

alias.object

script0.py

pi = 3.14
e = 2.72

def greeting():
    """
    prints hello to the console.
    """
    print("hello")

script1.py

# %% Imports
import script0 as alias
# %% calling functions
alias.greeting()

Notice that when we look in the local directory of the console we get alias and not script0. We can access the objects pi, e and greeting from alias using the same dot syntax as before.
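The import forms covered so far all give access to the same underlying objects, which the following sketch checks. It writes a stand-in module to a temporary folder so that it is self-contained; the module name demo_box and the temporary folder are assumptions made for this example.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# hypothetical stand-in for script0.py in a temporary folder
folder = Path(tempfile.mkdtemp())
(folder / "demo_box.py").write_text("pi = 3.14\ne = 2.72\n")
sys.path.append(str(folder))
importlib.invalidate_caches()

import demo_box             # import module
import demo_box as alias    # import module as alias
from demo_box import pi     # from module import object

# the alias is just another object name for the same module
same_module = alias is demo_box
# the imported object is the same object accessed using dot syntax
same_object = pi is demo_box.pi
# object names in the module that do not begin with an underscore
public_names = [name for name in dir(demo_box) if not name.startswith("_")]
```

Here public_names contains only e and pi, the two objects assigned in the stand-in module.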

Class attributes and methods

Let's go back to script0 and have a look at the variables pi and e in more detail. If we type in either variable or instance object name followed by a dot, we see that both of these objects have a number of other objects which can be referenced from them. An object that can be referenced with respect to another object is termed an attribute. We can see the clear analogy with the module and object with the instance and attribute:

module.object
instance.attribute

Note that the list of attributes that can be referenced from both variables pi and e have identical object names.

They are identical because both e and pi are instances of the same class. i.e. the float class which can be thought of as a blueprint for this datatype.

Let's have a look at the function is_integer. This is a very simple function which returns a bool True value if the float is also an integer and a bool False value if it is not.

This function can also be called directly from the float class. Comparing the two we see that the latter has the positional input argument self and the former doesn't. In the former case, because the function was called directly from an instance, the instance is provided as self. In other words self is a term used as a placeholder for the instance.

This function can be called in the following two ways, from the instance where self is implied and from the class where self is provided:

pi.is_integer()
float.is_integer(pi)
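The two calling forms can be compared side by side in a short sketch. The additional object name two is an assumption added here so that a True result can also be seen.

```python
pi = 3.14
two = 2.0

# called from the instance: self is implied to be pi
from_instance = pi.is_integer()
# called from the class: the instance pi is provided as self
from_class = float.is_integer(pi)
# 2.0 is a float that is also an integer
whole_number = two.is_integer()
```

Both forms give the same bool value for the same instance.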

If we reference the method without parentheses we are only informed what kind of object it is (i.e. we get the same behaviour as when we reference any other function). Referenced from an instance it is reported as a built-in method of that float object, and referenced directly from the class it is reported as a method of float objects.

The difference between the terms function and method is subtle.

The function is_integer called from the float instance pi is a method of the float class.

A method for all intents and purposes is a function but its 0th input argument must be an instance denoted with the positional input argument self.

Let's now look at the attribute real and imag which originate from complex numbers (recall that the imaginary component of a complex number originates from the definition of the square root of -1).

The float pi is a real number and therefore the attribute real should return the value of pi and the attribute imag should return 0.

When the attribute real is referenced from the instance pi, the value of pi 3.14 is returned. When the attribute real is referenced from the class float we are just informed that it is an attribute.

The value returned 3.14 is itself an instance of the float class. It will therefore give an identical list of attributes, defined by its blueprint the float class:

pi.
pi.real.
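A short sketch of these two attributes follows. The object names real_part, imag_part, real_is_float and same_attributes are assumptions introduced for this example.

```python
pi = 3.14

# attributes are referenced without parentheses
real_part = pi.real
imag_part = pi.imag

# the value returned by real is itself an instance of the float class
real_is_float = type(real_part) is float
# so it offers the same list of attributes as pi
same_attributes = dir(pi) == dir(pi.real)
```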

Under the hood attributes are normally created and accessed using get and set methods. Multiple attributes can be created when an instance is instantiated from the blueprint of the class. Instantiation uses the datamodel initialization method __init__ (also known colloquially as a double underscore or dunder method).

We can use the keyword class to begin making the blueprint for our own custom class. The class name usually uses CamelCaseCapitalization and is followed by parentheses. The parentheses enclose the parent classes and this line ends with a colon :, which indicates the beginning of a code block.

class ClassName(object):
    code

CamelCaseCapitalization is where each word begins with a capital letter and no spaces are used, differing from the snake_case convention used for object names and functions, where each word is lower case and an underscore is used to separate them. It should be noted however that the built-in classes such as int, float, bool, list, tuple and dict are lower case, and the CamelCaseCapitalization used for custom classes makes it more obvious to other programmers that they are using a custom class.

The parent class is a class which the current class is based upon; the current class can inherit methods and attributes from the parent class. Think of the blueprint of an everyday object such as a car. If an engineer makes a small change to the car, such as using a better type of tyre, the engineer does not need to make an entirely new car blueprint from scratch; they can just tell the production team to use the existing blueprint for everything else and to use the new tyres.
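The car analogy can be sketched in code. The class names Car and CarNewTyres, their attributes and the describe method are all hypothetical, and the call to super() (which runs the parent class's __init__) is one common way of reusing the parent blueprint rather than the only one.

```python
class Car(object):
    def __init__(self):
        self.engine = "petrol"
        self.tyres = "standard"

    def describe(self):
        return f"{self.engine} engine, {self.tyres} tyres"


class CarNewTyres(Car):
    def __init__(self):
        # use the existing blueprint for everything else
        super().__init__()
        # and only change the tyres
        self.tyres = "all-season"


new_car = CarNewTyres()
# describe is not defined in CarNewTyres, it is inherited from Car
description = new_car.describe()
```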

Within the class code block are a number of methods that belong to the class indicated by their indentation guides. Note that all of these have the 0th positional input argument self. One of these is the datamodel method __init__ which is executed when the class is instantiated. This can be used to create attributes upon instantiation and to call other methods during instantiation.

class ClassName(object):

    def __init__(self, *args, **kwargs):
        self.attribute_0 = value_0
        self.method_2(*args, **kwargs)


    def method_1(self, *args, **kwargs):
        code


    def method_2(self, *args, **kwargs):
        self.attribute_1 = value_1
        self.attribute_2 = value_2

Let's look at a basic practical example and create a Coordinate class with 2 attributes x and y which will be initialized when an instance of the class is created using the class's datamodel method __init__. Let's also define a method distance which calculates the distance between the current co-ordinate designated self and another co-ordinate which will be designated with the positional input argument other.

# Defining a class

class Coordinate(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y


    def distance(self, other):
        dx = self.x - other.x
        dy = self.y - other.y
        return (dx**2 + dy**2)**0.5

Now in the console let's create an instance.

c = Coordinate(1,2)

We can open the instance c in the variable explorer. To the right hand side we see the path of the Coordinate instance object c itself, the method c.distance and the int attributes c.x and c.y which are referenced with respect to the object c and called using dot notation.

Although we see the instance c on the variable explorer, we do not see the class Coordinate on the variable explorer. Once again we can look up the local directory of the console.

dir()

If we look at:

dir(c)

We will see all the datamodel methods that are defined for our instance c. Most of these, with the exception of __init__ which we defined directly, take on the default behaviour defined by the parent class object.

These can be examined in the variable explorer by selecting show __special__ attributes:

Since we used the dir function on this object, we can see that the special method __dir__ maps to the behaviour of this function.

If we input the instance c to the console we are informed that c is an object of the Coordinate class.

We can change this default behaviour by defining this method in our custom class and returning a str.

# Defining a class

class Coordinate(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y


    def distance(self, other):
        dx = self.x - other.x
        dy = self.y - other.y
        return (dx**2 + dy**2)**0.5
    
    
    def __repr__(self):
        return (f"({self.x} , {self.y})")

Rerunning the script and then instantiating c, we can have a look at the console output of:

c

This defines the default behaviour of the function repr:

repr(c)

Also the datamodel method (although not commonly done so) can be called from the instance and from the class by specifying an instance self:

c.__repr__()
Coordinate.__repr__(c)

Another related datamodel method is __str__. This maps to the behaviour of the str class on the object, whose output can be stored to an object (that is an instance of the str class), and also to the behaviour of the print function.

It is common to make the output of __repr__ resemble how you instantiate the class and to use __str__ for a more informal kind of representation.

# Defining a class

class Coordinate(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y


    def distance(self, other):
        dx = self.x - other.x
        dy = self.y - other.y
        return (dx**2 + dy**2)**0.5
    
    
    def __repr__(self):
        return (f"Coordinate({self.x} , {self.y})")
    
    
    def __str__(self):
        return (f"({self.x} , {self.y})")

Then we can see:

c

Gives the formal representation (defined by __repr__).

strc = str(c)

Stores the informal representation of c defined using the datamodel method __str__ to a str, strc which displays in the variable explorer. This could for example be nested in a report about the co-ordinate c.

print(c)

Prints the value of c using the informal representation defined by the method __str__.

There are numerous other datamodel methods available which for example map to the comparison operators ==, !=, >=, >, <= and <. If we wanted to we could redefine all of these for our Coordinate class (sometimes it makes sense to do so and other times it doesn't).
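As a hedged sketch of one such redefinition, the == operator maps to the datamodel method __eq__. The definition below, where two co-ordinates are equal when both attributes match, is an assumption about what equality should mean for this class, and the class is cut down to the minimum needed for the example.

```python
class Coordinate(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        # two co-ordinates are equal when both attributes match
        return self.x == other.x and self.y == other.y


c = Coordinate(1, 2)
d = Coordinate(1, 2)
f = Coordinate(4, 6)

c_equals_d = c == d
c_equals_f = c == f
# by default != gives the opposite result of __eq__
c_not_equal_f = c != f
```

Without __eq__ defined, c == d would be False because the default behaviour inherited from object only checks whether both object names refer to the same instance.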

There are a number of other operators typically used for numeric datatypes such as the + operator. The behaviour of this operator is defined by use of the __add__ datamodel method and because it is not defined for our Coordinate class we get an error when we attempt to use it.

If we look to classes that we do know, such as the str class, that do use the operator +, the code completion when explicitly using this datamodel method expects a self and a value positional input argument.

We can see that the str class uses the + operator to perform concatenation whereas the int class uses it to perform numeric addition. Under the hood these classes have a different definition for the datamodel methods.

The str class has no __sub__ datamodel method defined, so a TypeError is raised when the - operator is used between two str instances. The int class defines it to perform subtraction.
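The behaviour of + and - for the str and int classes can be checked with a short sketch. The object names used are assumptions introduced for this example.

```python
# the + operator and the explicit datamodel method give the same result
text_sum = "ab" + "cd"
text_sum_explicit = str.__add__("ab", "cd")

int_sum = 1 + 2
int_sum_explicit = int.__add__(1, 2)
int_difference = 3 - 1

# the str class does not define __sub__
str_has_sub = hasattr(str, "__sub__")

try:
    "ab" - "cd"
except TypeError as error:
    message = str(error)
```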

Most custom classes that we make will use the operation defined in an inbuilt class (int, float or a str) as a basis. The attributes x and y in the Coordinate class are each int instances for example. We could define addition as the summation of the x attributes in the instances self and other to give a new x attribute and the summation of the y attributes in self and other to give a new y attribute and then use the method's return statement to return a new Coordinate instance with these new attributes.

# Defining a class

class Coordinate(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y


    def distance(self, other):
        dx = self.x - other.x
        dy = self.y - other.y
        return (dx**2 + dy**2)**0.5
    
    
    def __repr__(self):
        return (f"Coordinate({self.x} , {self.y})")
    
    
    def __str__(self):
        return (f"({self.x} , {self.y})")
    
    
    def __add__(self, other):
        new_x = self.x + other.x
        new_y = self.y + other.y
        return(Coordinate(new_x, new_y))

Now:

c + c

Displays Coordinate(2 , 4) as expected. Recall that the output isn't assigned to an object name, so it is instead output to the console using the formal representation defined by the __repr__ datamodel method.

Let's restart the Kernel, create two Coordinate instances c and d and check if our distance method works.

c = Coordinate(1, 2)
d = Coordinate(4, 6)
c.distance(c)
c.distance(d)
c + d

The distance from c to c is 0 and from c to d is 5.0 (as the data leads to the 3, 4, 5 Pythagorean triple). The + operator also works as expected.

importing a class from another module

The code in script0 is 26 lines (left hand side). We could go on and on defining all the other datamodel methods as well as additional attributes and methods, and we could create detailed docstrings for each method, easily taking the code to a couple of hundred lines.

The code on the right hand side is only 5 lines and perhaps at any given point in time we are only interested in using a subset of the class methods. In this scenario it does not make sense to edit script0; instead we import it and use it.

Now let's reference a class from another file.

script0.py

# Defining a class

class Coordinate(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y


    def distance(self, other):
        dx = self.x - other.x
        dy = self.y - other.y
        return (dx**2 + dy**2)**0.5
    
    
    def __repr__(self):
        return (f"Coordinate({self.x} , {self.y})")
    
    
    def __str__(self):
        return (f"({self.x} , {self.y})")
    
    
    def __add__(self, other):
        new_x = self.x + other.x
        new_y = self.y + other.y
        return(Coordinate(new_x, new_y))

script1.py

# %% Imports
import script0
# %% Instantiate Coordinates
c = script0.Coordinate(1,2)
d = script0.Coordinate(3,4)
# %% Access Attributes
xcoord_c = c.x
# %% Call Method
pythagoras_len = c.distance(d)

builtin modules

Up until now we have been writing all our own code so we could understand how importing objects from another module works. The Python programming language has a number of builtin modules. Builtin modules are modules that are built into Python and typically written in another programming language such as C for performance purposes. As a consequence we cannot access the source code of these modules but we can import them and use the objects defined within them.

sys module

Let's start by skimming over the sys (system) module. We can import it as if it is a file.

import sys

Because it is a builtin module, Kite (the autocompletion) will list the most commonly used objects from the module at the top followed by the list of objects available alphabetically.

We can use the dot notation to access the list object path and can assign it to an object name so it displays on the variable explorer:

import sys
path_list = sys.path

Now we can expand the list using the variable explorer. What we see is a list of where Python will look for objects to import from if it cannot find an object in the console's directory or within the currently opened folder.

If we move script2 to a subfolder and attempt to import it we will get a ModuleNotFoundError. This is because it is in a different folder to the script being executed script0 and not present in the path:

As sys.path is a list of str, we can append a str object to the end of the path using the list append method.

We can copy and paste the path from Windows Explorer into a raw str (a str prefixed with r). In a raw str each \ is taken literally. This matters because in an ordinary str \ is used to place an escape character, e.g. \t for tab and \n for new line. Without the r prefix, each \ in a file path must be written as \\ within the str: the first \ denotes placement of a special character within the str and the second \ denotes that the special character is \.

subfolder = r"C:\Users\Phili\Documents\Python_Scripts\subfolder"
sys.path.append(subfolder)

Now when we attempt to import script2 it will also look within the subfolder and successfully import it. The dir() command will find the object script2 in the console's directory.

We can also see that path_list on the variable explorer shows our subfolder added to the end of the path.

Restarting the Kernel will restore sys.path to its original state.
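The r prefix used for the subfolder path above can be examined in a short sketch. The path used here is a hypothetical example and does not need to exist, as the str instances are only compared and never used to open a folder.

```python
# raw str: each \ is taken literally
raw_path = r"C:\Users\name\subfolder"
# ordinary str: each \ must be written as \\
escaped_path = "C:\\Users\\name\\subfolder"
same_path = raw_path == escaped_path

# in an ordinary str \t is a single tab character
tab_length = len("\t")
# in a raw str \t is two characters, a backslash and the letter t
raw_tab_length = len(r"\t")
```

Note that the hypothetical path contains \n (in \name), which an ordinary str would have interpreted as a new line.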

The modules in index 0 (builtin modules) on the sys.path list are not accessible as .py script files. Under the hood these are written in other programming languages such as C. We can use dot notation to view the object builtin_module_names from the sys module and assign it to the variable builtin_modules.

import sys
builtin_modules = sys.builtin_module_names

In the variable explorer we see that this is a tuple of str. We can scroll through it, ignoring the private modules at the top that begin with a single underscore _; these private modules are not designed for the average Python user but exist for Python maintainers. Within the builtin module names we can see some of the most commonly used modules such as sys, builtins, math and time.
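Instead of scrolling, membership can be checked directly with the in keyword, as in the sketch below. The exact contents of builtin_module_names depend on the platform and on how Python was built, so only sys and builtins are checked here; the object names are assumptions for this example.

```python
import sys

builtin_modules = sys.builtin_module_names

# the type of the object
modules_type = type(builtin_modules).__name__
# membership checks
sys_is_builtin = "sys" in builtin_modules
builtins_is_builtin = "builtins" in builtin_modules
# private module names begin with a single underscore
private_modules = [name for name in builtin_modules if name.startswith("_")]
```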

builtins module

We see one of the modules is called builtins. If we use dot indexing from the module builtins we see a list of all the inbuilt objects in standard Python:

import builtins

This can be useful to reference as a beginner as Kite lists the most commonly used builtin objects and then everything else alphabetically. In this list you will quickly see what is a class, a function or an instance of a class (shown as text). Among these you will see many Error classes which are used to flag up common errors.

There is no point importing anything from this module as by default these are already builtin (by definition). For example the class builtins.str is the class str.
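This can be confirmed with the is keyword, which checks whether two object names refer to the same object. The object names same_class, same_function and same_error are assumptions for this example.

```python
import builtins

# the class accessed from the module is the class we already have
same_class = builtins.str is str
# the same is true for functions
same_function = builtins.print is print
# and for the Error classes
same_error = builtins.TypeError is TypeError
```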

math module

Let's have a look at the builtin module math. Once again we can import from the module as if it was a script file.

import math

Then we can use dot indexing from the module math to view mathematical functions and mathematical text objects.

We can access two constants pi, e and a function sqrt from this module using dot indexing from the module.

import math
x = math.pi
y = math.e
z = math.sqrt(4)

Alternatively, we can import each of these objects individually. The instances pi and e will display on the variable explorer; the function sqrt will not, but it will be seen in the console's directory, i.e. following the same behaviour as when we imported our own custom functions from our custom module.

from math import pi, e, sqrt
pi
e
sqrt_4 = sqrt(4)
print(dir())

time module

The builtin time module is used for time access operations for example when running a Python script. Let's import it:

import time

Now we can use dot notation from time. We can use the sleep function to pause the script for a specified time in seconds. This can be useful when printing output to the console, ensuring the user has time to read the output. Once again we can expand the tooltip:

For example if we want to print "hello" to the console and display it for 10 seconds, then execute code which prints a lot of rapid information to the console (in this case a for loop of spaces) and finally ends with a print of "goodbye" we can use:

import time
print("hello")
time.sleep(10)
for i in range(100):
    print(" ")
print("goodbye")
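To check how long a pause actually lasted, the time module also provides the function perf_counter, which returns a clock reading in seconds. The sketch below uses a short pause of 0.2 seconds so it runs quickly; the object names start and elapsed are assumptions for this example.

```python
import time

# clock reading before the pause
start = time.perf_counter()
time.sleep(0.2)
# difference between the clock readings after and before the pause
elapsed = time.perf_counter() - start

print(f"paused for {elapsed:.2f} seconds")
```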

standard modules

Standard modules are included in every single Python installation and are found in the Lib folder; third-party packages are installed in its site-packages subfolder. The Python Module Index gives more details about these standard modules.

We can go to this folder and access the source code of these modules by opening the .py file in Spyder or NotePad++.

The file location for an Anaconda installation on Windows 10, written using the UserProfile environment variable, is more generally:

%UserProfile%\Anaconda3\Lib

In Linux the location has the form (change your username and select your current Python version):

/home/philip/anaconda3/lib/python3.8

random module

Now let's have a look at the inbuilt module random which can be used to generate random numbers. Once again we can import from the module as if it was a script file.

import random

Then we can use dot indexing from the module random to view the functions available.

We can use the function seed to reset the random number generator to a known starting state (so the same random numbers are produced each time, for consistency). We can use the functions random (note that the function is called random and the module is also called random, making the reference random.random), randint and choice to generate a random number between 0 and 1, a random integer between a specified lower and upper integer bound, and a random selection from a list of choices respectively. In this example, all the objects are accessed from the random module using dot notation.

import random
random.seed(0)
a = random.random()
random.seed(0)
b = random.randint(0, 10)
random.seed(0)
c = random.choice(["Heads", "Tails"])

We can also open the random.py file. We see that it is quite involved with a length of 28,802. However at the top of the file we can see that it generally uses imports from other modules.

A class Random is defined:

A method seed is defined:

And you can scroll through the file to see how the methods random, randint and choice are defined. In general a deep understanding of the code in the module file is not required although it can be educational to look through the source code.

Since we only used four objects from this module, we can also import them directly.

from random import seed, random, randint, choice
seed(0)
a = random()
seed(0)
b = randint(0, 10)
seed(0)
c = choice(["Heads", "Tails"])
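The effect of seed can be checked by generating the same three values twice, resetting the seed to 0 before each run. The object names first_run, second_run and repeatable are assumptions for this example, and no particular values are assumed, only that the two runs match.

```python
import random

random.seed(0)
first_run = [random.random(),
             random.randint(0, 10),
             random.choice(["Heads", "Tails"])]

# resetting the seed restarts the same sequence of random numbers
random.seed(0)
second_run = [random.random(),
              random.randint(0, 10),
              random.choice(["Heads", "Tails"])]

repeatable = first_run == second_run
```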

keyword module

Let's have a look at something much simpler, the keyword.py module. In this case we can see that the module essentially consists of a list object kwlist, defined on lines 17-52, which is a list of str containing all the keywords. A function iskeyword is defined on line 55.

We can import this module:

import keyword

Then we can use dot indexing from the module keyword to view both the function iskeyword and the list object kwlist.

We can check if the str "def" is a keyword and assign the output to def_is_keyword, and assign kw_list to be the list object kwlist.

import keyword
def_is_keyword = keyword.iskeyword("def")
kw_list = keyword.kwlist

We can see def_is_keyword in the variable explorer with the value True, alongside kw_list. We can open kw_list and view it in more detail, in this case seeing the keywords it contains.
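A practical use of iskeyword is checking whether a name is safe to use as an object name. The following sketch tests a handful of candidate names; the list candidate_names and the name reserved are made up for this example.

```python
import keyword

candidate_names = ["pi", "def", "class", "script0", "import"]

# Keep only the candidates that are reserved keywords.
reserved = [name for name in candidate_names if keyword.iskeyword(name)]

print(reserved)  # ['def', 'class', 'import']
print("def" in keyword.kwlist)  # True
```

Any name that appears in reserved cannot be used on the left hand side of the assignment operator.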

fractions module

Let's have a look at the fractions.py file. At the top we can see that this module is dependent on other modules. We can see that this module imports the class Decimal from another module decimal. The math, numbers, operator, re and sys modules are all imported.

We can also glance through the definition of the class Fraction.

In Spyder, if we intend to use Fraction objects regularly within our script, it may be more convenient to import the fractions module as the 2 letter abbreviation fr.

import fractions as fr

Now we can use dot notation from fr. Let's create an instance of the Fraction class. We see that the autocompletion gives us the docstring instructing us how to initialize a new instance (which under the hood uses the class's __new__ method; Fraction defines __new__ rather than __init__).

Now we can create two instances a and b and perform a mathematical operation between them to get a third instance c.

import fractions as fr
a = fr.Fraction(5, 10)
b = fr.Fraction(2, 10)
c = a + b

We can see the instances of the Fraction class a, b and c displayed in the variable explorer. Let's expand one of these. At the top we see a number of private objects which begin with an underscore. We then see the objects that can be called from the object c (which is a Fraction instance).

We see these in the Kite autocompletion when we type in the instance c followed by a dot.

There are a number of datamodel methods which can also be viewed.

The datamodel method __add__ which we examined earlier defines the behaviour of the + operator between the instance self and the instance other. The datamodel method __str__ defines how an instance of the Fraction class behaves when the functions str or print are used, and the datamodel method __repr__ defines how an instance of the Fraction class behaves when the function repr is used (as we also saw before).
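The three datamodel methods can be seen in action with the same instances a, b and c. This is a short sketch; note that a Fraction is always stored in its lowest terms, so 5/10 is held as 1/2.

```python
import fractions as fr

a = fr.Fraction(5, 10)  # stored in lowest terms as 1/2
b = fr.Fraction(2, 10)  # stored in lowest terms as 1/5
c = a + b               # the + operator calls a.__add__(b)

print(a)                   # 1/2
print(str(c))              # 7/10
print(repr(c))             # Fraction(7, 10)
print(c == a.__add__(b))   # True
print(c.numerator, c.denominator)  # 7 10
```

Calling a.__add__(b) directly is only done here to show the link with the operator; in normal code we would simply write a + b.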

statistics module

Let's have a look at the statistics.py file. We can see that this module imports the math, numbers and random modules and also imports specific objects from other modules.

Let's import the module statistics:

import statistics

Now we can use dot notation from statistics. Let's have a look at using the mean function. We see that the autocompletion gives us the docstring instructing us how to use this function.

Let's create a list of data and carry out some basic statistics on it:

import statistics

data = [1, 1, 1, 1, 2, 2, 3, 3, 4]

data_mean = statistics.mean(data)

data_variance = statistics.variance(data)
data_stdev = statistics.stdev(data)

data_median = statistics.median(data)
data_mode = statistics.mode(data)
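It can be useful to check what these functions are doing by hand. The sketch below recomputes the variance and standard deviation for the same data; it shows that variance and stdev are the sample versions, which divide by n - 1, whereas pvariance divides by n. The names prefixed with manual_ are just illustrative.

```python
import math
import statistics

data = [1, 1, 1, 1, 2, 2, 3, 3, 4]
n = len(data)

manual_mean = sum(data) / n  # 18 / 9 = 2.0
squared_deviations = sum((x - manual_mean) ** 2 for x in data)  # 10.0

manual_variance = squared_deviations / (n - 1)   # sample variance, 10 / 8
manual_stdev = math.sqrt(manual_variance)
manual_pvariance = squared_deviations / n        # population variance, 10 / 9

print(manual_variance, statistics.variance(data))
print(manual_stdev, statistics.stdev(data))
print(manual_pvariance, statistics.pvariance(data))
```

Each printed pair should agree, up to floating point rounding.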

datetime module

The datetime module is a module for working with date and time variables. Let's import the module datetime:

import datetime

Now we can use dot notation from datetime. Supposing we are interested in the datetime class (called from the datetime module), we can type it with open parenthesis which will show a popup balloon tooltip with part of the docstring.

If we follow the link at the bottom of the tooltip it will open the full docstring in the Spyder help pane.

We can also highlight the object of interest and type [Ctrl] + [ i ] to inspect the object in the help pane.

We can create a datetime instance and assign it to the variable a. We can then assign a timedelta instance to the variable b and create a new datetime instance c by adding a and b.

import datetime
a = datetime.datetime(year=2021, month=1, day=1)
b = datetime.timedelta(days=1)
c = a + b
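Once the instances exist, the dot syntax gives access to their attributes and methods. The following is a brief sketch using the same a, b and c; the name difference is only illustrative.

```python
import datetime

a = datetime.datetime(year=2021, month=1, day=1)
b = datetime.timedelta(days=1)
c = a + b

print(c.isoformat())     # 2021-01-02T00:00:00
print(c.year, c.month, c.day)  # 2021 1 2
print(c.weekday())       # 5 (Monday is 0, so 5 is Saturday)

# Subtracting two datetime instances gives back a timedelta instance.
difference = c - a
print(difference == b)   # True
```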

data science packages

The site-packages folder in an Anaconda3 installation contains a number of data science libraries which are usually in the form of Python packages. In Windows Explorer these packages are essentially subfolders of site-packages.

Each package contains a number of Python modules. Each of these packages also includes an __init__.py file, which is run when the package is imported. This is analogous to the __init__ datamodel method being invoked when a class is instantiated.

Taking the seaborn package as an example, the first 14 lines of code in its __init__.py module are imports which bring in all objects (indicated by a *) from most of the other modules of the package. Note that no name is specified before the dot in any of these imports; the leading dot indicates a relative import, i.e. that we are importing from modules in the same folder as the __init__.py file.

Line 17 also imports matplotlib, another package (data science library) that seaborn is built upon. seaborn can be considered as a wrapper around matplotlib and includes a number of functions which can be used to rapidly create plots that are commonly used for data visualization. All of these plots could in theory be created in matplotlib directly, but doing so would normally require many more lines of code to get to the same result.

In the seaborn folder (package) we see the subfolders colors, external and test. These folders are all subpackages and these subpackages all also contain their own __init__.py file.
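To make the role of __init__.py more concrete, the sketch below builds a tiny throwaway package in a temporary folder and imports it. The package name mypackage, the module shapes and the function square are all hypothetical names invented for this illustration and have nothing to do with seaborn itself.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # A package is a folder containing an __init__.py file.
    package_dir = Path(tmp) / "mypackage"
    package_dir.mkdir()

    # An ordinary module inside the package.
    (package_dir / "shapes.py").write_text("def square(x):\n    return x * x\n")

    # The leading dot means "from the module shapes in this same folder".
    (package_dir / "__init__.py").write_text("from .shapes import *\n")

    # Let Python search the temporary folder, import the package, then tidy up.
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        mypackage = importlib.import_module("mypackage")
        result = mypackage.square(3)
    finally:
        sys.path.remove(tmp)

print(result)  # 9
```

Because __init__.py ran when the package was imported, square can be called directly from mypackage using the dot syntax, without mentioning the shapes module.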

We import the data science library in an identical manner to importing a module. For commonly used data science libraries we generally use a 2-3 letter alias. In the case of seaborn this is sns:

import seaborn as sns

And once seaborn is imported as the alias we can use dot notation to call items from the package. Let's use the lineplot function to create a lineplot.

The only two keyword input arguments we need to provide are the x and y data to be plotted which we will define as numeric lists.

# %% Imports
import seaborn as sns
# %% Generate Data
x_data = [1, 2, 3, 4, 5]
y_data = [0.95, 2.05, 2.95, 4.05, 4.95]
# %% Create a Line Plot
sns.lineplot(x=x_data, y=y_data)

The plot is shown on the plots pane in Spyder but looks rather bland (it uses matplotlib's normal plot defaults).

seaborn has a number of inbuilt styles. The function set_style can be used to apply one of these styles to all figures made via seaborn or directly via matplotlib. We can have a look at the docstring for this function and use the 0th style listed, "darkgrid".

Rerunning this code with the darkgrid style will create a better looking plot:

# %% Imports
import seaborn as sns
sns.set_style("darkgrid")
# %% Generate Data
x_data = [1, 2, 3, 4, 5]
y_data = [0.95, 2.05, 2.95, 4.05, 4.95]
# %% Create a Line Plot
sns.lineplot(x=x_data, y=y_data)

Covering deeper usage of seaborn requires prerequisite knowledge of matplotlib, which is not the focus of this guide.

There are a multitude of data science libraries available in the Anaconda installation. The most notable, however, are the ones that are considered the primary data science libraries, which are conventionally imported using a 2-3 letter alias:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

additional packages

Although the Anaconda Python distribution contains the most commonly used data science libraries, there are thousands of other Python packages available to download and install. These can be installed using conda install commands in the Anaconda Powershell Prompt (Windows 10) or Terminal (Linux). conda install commands should be used in preference to pip install commands where possible; a Google search for conda install followed by the name of the package to be installed will usually take you to the Anaconda website with install instructions.

conda install commands check for inter-module/inter-package dependencies and will aim to address any conflicts. pip is not aware of how conda manages the environment, so mixing pip installs into a conda environment can result in problems. Recall for example that seaborn is a wrapper for matplotlib and uses matplotlib in the background. A newer version of matplotlib may change the behaviour of a seaborn plotting function, therefore breaking seaborn functionality. A given release of seaborn is therefore designed to work with specific versions of matplotlib.

Let's use the python-docx package as an example:

conda search --channel conda-forge python-docx
conda install --channel conda-forge python-docx

Now that this package is installed, the docx package displays in the site-packages folder.
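We can also confirm from within Python that a package can be found before trying to use it. The following is a minimal sketch using importlib.util.find_spec from the standard library; the helper function is_installed is a name made up for this example.

```python
import importlib.util


def is_installed(module_name):
    """Return True if Python can locate a top-level module or package with this name."""
    return importlib.util.find_spec(module_name) is not None


print(is_installed("statistics"))  # True, it is part of the standard library
print(is_installed("docx"))        # True only if python-docx has been installed
```

Note that the name to check is the import name docx, not the name python-docx used in the conda install command.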

We can then copy and paste the example code from the python-docx documentation into script0 and comment out the line of code for adding a picture. Running script0 generates the demo.docx file which can be opened in Microsoft Word:

careful consideration of object names

Care should be taken when creating object names and Python script file names. If script1.py was renamed docx.py then the line of code:

from docx import Document

Would examine the current folder and find the docx.py module. It would then stop its search for docx. Within our custom docx module it would then look for the class Document and import it. The docx package in the site-packages folder would not be examined.

Beginners in particular sometimes run into this problem if they name their first scripts, for example, numpy.py or pandas.py when taking notes on these libraries.
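The search order can be demonstrated without breaking anything. The sketch below asks Python which file a module name would be loaded from, first normally and then after a folder containing a file called fractions.py is placed at the front of the search path (which is where the folder of the script being run usually sits). The helper origin_of and the temporary fractions.py notes file are made up for this illustration.

```python
import sys
import tempfile
from importlib.machinery import PathFinder
from pathlib import Path


def origin_of(module_name):
    """Return the file that sys.path would resolve this module name to."""
    spec = PathFinder.find_spec(module_name)
    return spec.origin if spec else None


before = origin_of("fractions")  # the standard library file

with tempfile.TemporaryDirectory() as tmp:
    # A badly named notes script in the "current" folder.
    (Path(tmp) / "fractions.py").write_text("# my notes on the fractions module\n")
    sys.path.insert(0, tmp)
    try:
        after = origin_of("fractions")  # the notes script is found first
    finally:
        sys.path.remove(tmp)

print(before)
print(after)
```

When an import behaves unexpectedly, printing the __file__ attribute of the imported module is a quick way to check whether a script of the same name has been picked up instead.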