Importing Standard Modules and Data Science Libraries

Anaconda is a Data Science Python Distribution which includes:

  • Python
  • Python Standard Modules
  • The Numeric Python library – numpy
  • The Python Data Analysis library – pandas
  • The plotting library – matplotlib

This tutorial looks at importing these modules and libraries, and at which files on disk are actually loaded when an import statement runs.

Anaconda PowerShell Prompt

Open the Anaconda PowerShell Prompt from the Start Menu:

img_001

By default it will open in %USERPROFILE%:

img_002
img_003

Notice the Anaconda PowerShell Prompt begins with (base). This means the (base) Python environment is selected:

img_004

This is the python.exe found in the Anaconda3 folder:

img_005
img_006

The Anaconda PowerShell Prompt uses the PowerShell language by default, indicated by PS at the start of the prompt:

img_007

The > indicates a new prompt:

img_008

Commands entered at the prompt have the following syntax:

command option -p parametervalue1
img_009
command option --parametername2 parametervalue2
img_010
command option --parametername3
img_011

Python

To launch Python from the Anaconda PowerShell Prompt input:

python
img_012

Notice that details about the Python version are displayed, followed by a new prompt >>>.

Python uses the following functional syntax which is different to the command line syntax seen above:

function(value1, arg2=value2)
img_013
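As a small illustration of this syntax, the sketch below calls two built-in functions. The values passed in are arbitrary and chosen only for the example: the first value is supplied by position, and the second is supplied by name.

```python
# A positional value followed by a named argument (name=value).
rounded = round(3.14159, ndigits=2)
print(rounded)

# print accepts several positional values and the named argument sep.
print("numpy", "pandas", "matplotlib", sep=", ")
```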

Note that these are two different programming languages.

Importing Libraries

Notice the base Python environment has a Lib folder:

img_014

This contains the Python standard modules such as email:

img_015

If the email folder is examined, it contains an __init__.py file. This is the Python file that is run by default when the folder is imported by name.

img_016

The module can be imported using:

import email

And the path of the physical file can be examined using the data model attribute __file__:

email.__file__
img_017

Note that every \ in the path is displayed as \\. In a Python string \ begins an escape sequence, so a literal \ has to be written as the escape sequence \\.
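The doubling only appears in the way the string is displayed, not in the string itself. The sketch below shows this with a made-up Windows-style path (the folder names are hypothetical and used only for illustration):

```python
# Hypothetical path for illustration only; it does not need to exist.
path = "C:\\Users\\example\\anaconda3\\Lib\\email\\__init__.py"

print(repr(path))  # the form the >>> prompt echoes, every \ shown as \\
print(path)        # the actual characters, single \ separators

# A raw string (r prefix) lets the same path be typed without doubling.
raw_path = r"C:\Users\example\anaconda3\Lib\email\__init__.py"
print(path == raw_path)
```

Both spellings produce the same string, and the string contains six backslashes, not twelve.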

The email standard module has submodules, which are accessed using a . and in Python the . essentially means belonging to this object. The .py file extension is never included in an import statement, so there is no confusion with the . used to indicate a file extension.

For example:

import email.charset

would import the charset.py file in this email folder, after which it can be referenced as email.charset.
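A minimal sketch of this, importing the submodule explicitly by its dotted name so that it is guaranteed to be loaded, and then examining where it came from:

```python
import email.charset

# The submodule is now available as an attribute of the email package.
print(email.charset.__name__)
print(email.charset.__file__)
```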

Some standard modules are smaller and are not contained in a folder. For example the datetime module is a single datetime.py file:

img_018

It can be imported and details of its file can be examined using:

import datetime
datetime.__file__
img_019
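The difference between a folder-based module and a single-file module can also be checked from Python. The sketch below assumes a standard CPython installation such as the one in Anaconda. A module that is a folder has a __path__ attribute and a single-file module does not:

```python
import os
import email
import datetime

# A folder-based module (a package) has a __path__ attribute.
print(hasattr(email, "__path__"))
# A single-file module does not.
print(hasattr(datetime, "__path__"))

# The file name at the end of each __file__ path.
print(os.path.basename(email.__file__))
print(os.path.basename(datetime.__file__))
```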

The third-party data science libraries are found in the site-packages folder:

img_020

There is normally a folder named after the library, which contains its Python script files, alongside a second folder whose name includes the version number and which holds the installation metadata:
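The same information can be looked up from Python using only standard modules. This is a sketch, not the only way to do it: installed_version is a helper name made up for this example, and it returns None when a library is not installed in the current environment.

```python
import sysconfig
from importlib import metadata

# Folder where third-party libraries are installed for this environment.
site_packages = sysconfig.get_paths()["purelib"]
print(site_packages)

def installed_version(name):
    """Return the installed version of a library, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None

for name in ["numpy", "pandas", "matplotlib"]:
    print(name, installed_version(name))
```

The version is read from the metadata folder that sits next to the library folder in site-packages.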

img_021

The numpy library has an __init__.py file which is the Python file imported when numpy is imported:

img_022

As numpy is very commonly used, it is typically imported using the two-letter alias np:

import numpy as np
np.__file__
img_023
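An alias is just a second name for the same module object. The sketch below shows this with the datetime standard module, so that it runs even in an environment without numpy. The alias dt is chosen only for this example.

```python
import datetime
import datetime as dt

# Both names refer to the same module object, loaded from the same file.
print(dt is datetime)
print(dt.__file__ == datetime.__file__)
print(dt.__name__)  # the module keeps its real name, not the alias
```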

The pandas library has an __init__.py file which is the Python file imported when pandas is imported:

img_024
img_025

As pandas is very commonly used, it is typically imported using the two-letter alias pd:

import pandas as pd
pd.__file__
img_026

The matplotlib library has an __init__.py file:

img_027
img_028

However, typically only one module of this library, called pyplot, is used:

img_029

As pyplot is very commonly used, it is typically imported using the three-letter alias plt:

import matplotlib.pyplot as plt
plt.__file__
img_030

The locations of these modules and libraries can be seen together:

img_031
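The locations can also be collected in one short script. The sketch below uses importlib.util.find_spec from the standard library, which looks a module up without fully importing it. The helper name locate is made up for this example, and it returns None for anything not installed in the current environment.

```python
import importlib.util

def locate(name):
    """Return the file a module would be imported from, or None if not found."""
    try:
        spec = importlib.util.find_spec(name)
    except ModuleNotFoundError:
        # Raised when the parent of a dotted name (e.g. matplotlib) is missing.
        return None
    return spec.origin if spec is not None else None

for name in ["email", "datetime", "numpy", "pandas", "matplotlib.pyplot"]:
    print(name, "->", locate(name))
```

In the (base) environment the first two paths should be in the Lib folder and the remaining three in the site-packages folder.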

To exit Python use the function exit():

img_032

Notice that because this is a function, it is followed by parentheses.

To exit the Anaconda PowerShell Prompt use the command exit:

img_033

Notice there are no parentheses.

There is a difference in syntax as Python and PowerShell are two different programming languages.
