The Anaconda Python Distribution 2021-11 (Linux)

Installing the Anaconda Python Distribution with the Spyder 5 IDE and JupyterLab 3 IDE on Ubuntu 20.04 LTS or 21.10 and Best Practices for Managing a conda Environment

This website is maintained by an individual and technology enthusiast, Philip Yip. Although I have been recognised as a Dell Community Rockstar and Microsoft MVP, I am affiliated with neither company. If you've found my tutorials helpful, please consider making a one-time small donation to offset the WordPress Premium Plan costs to host the website.

Operating System

This guide covers only the Linux operating system, using Ubuntu 21.10 as an example. The instructions should also work for Ubuntu 20.04 LTS, for Ubuntu-based distros such as Linux Mint, Zorin OS and KDE Neon, and for other Linux distros such as Fedora and Deepin.

Installation requires using the terminal to run a .sh installer with bash, and care needs to be taken to initialise the conda install so that the terminal recognises the conda commands. Unlike on Windows, Start Menu shortcuts are not created on Linux, so the Anaconda Navigator, Spyder and JupyterLab usually need to be run via the terminal.

Note some conflicts can arise (mainly due to a previous version of Anaconda leaving old configuration files and environments behind). I will discuss purging these to allow for a clean installation.

A Windows Guide is available below:

Python and pip (Pip Installs Packages)

Python beginners are recommended to install Anaconda as opposed to installing Python…

The installer from Python.org contains the Python programming language and a set of inbuilt modules such as datetime and math, used for basic date and time operations and basic mathematical operations respectively. It also includes pip (a recursive abbreviation for Pip Installs Packages), which can be used to install third-party packages, otherwise known as Python libraries.

pip is not very beginner friendly… It will first allow one to install a package without its dependencies. For example the data science library seaborn is a plotting package based on matplotlib, and matplotlib is built upon numpy. seaborn requires the dependencies numpy, pandas, matplotlib and scipy in order to work correctly. Moreover there is the problem of versions. If there is a change in how numpy works, this may break some functionality in matplotlib and this may in turn break some functionality in seaborn. This can cause huge headaches, particularly for beginners.
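As a rough illustration of the dependency chain described above, the sketch below uses the standard-library importlib.metadata module to list the requirements that an installed package declares. The helper name declared_requirements is hypothetical, and what is printed depends entirely on which packages happen to be installed in the active environment.

```python
from importlib import metadata


def declared_requirements(package):
    """Return the requirement strings a package declares, or [] if unknown."""
    try:
        # requires() returns None when the package declares no requirements
        return metadata.requires(package) or []
    except metadata.PackageNotFoundError:
        # The package is not installed in the active environment
        return []


for name in ("seaborn", "matplotlib", "numpy"):
    print(name, "->", declared_requirements(name))
```

An empty list means either that the package is not installed or that it declares no requirements, so this is only a quick inspection aid, not a substitute for a package manager.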

The conda package manager

Miniconda and Anaconda instead use the conda package manager. The conda package manager will install the Python package alongside its required dependencies. It will also check for incompatibilities between packages already existing in the conda environment and attempt to solve them.

It is recommended for beginners to always use:

conda install package

In preference to:

pip install package

wherever possible in order to avoid incompatibilities. This will be explained in a bit more detail later.

Miniconda vs Anaconda

Miniconda is free to use (for home and commercial use) and has the conda package manager and a minimal base environment (essentially just Python and conda).

Anaconda is licensed; however, the Individual Edition is free for individual use. Anaconda has the conda package manager and a base environment that contains the Python programming language as well as the most commonly used data science libraries such as numpy, matplotlib, pandas, scipy and seaborn (amongst a multitude of others). The Anaconda base environment also contains a number of Python Integrated Development Environments (IDEs), that is, programs for writing, debugging and running Python code. The base Anaconda environment in the 2021-11 installer includes Spyder 5.1.5 and JupyterLab 3.2.1, which are particularly stable.

conda channel and conda-forge channel

The conda package manager is commonly used to install Python packages from two channels, the conda channel and the conda-forge channel.

The conda channel (named defaults in conda's configuration) is maintained by the Anaconda company, who have spent more time testing specific, usually more stable, package versions together.

The conda-forge channel is the community channel with contributions made directly by the package developers. It is usually (but not always) more up to date. It also has a far larger number of packages available, including the smaller, less commonly used packages.

conda environments

If you want to use a Python package that is not included in the base environment you can install it directly into the base environment and the conda package manager will solve the environment if the new package doesn't have many dependencies.

Note however that the base environment in Anaconda is very large and if you are installing a package that has a lot of dependencies, numerous conflicts may arise and the conda package manager may get stuck on solving the environment.

In such a case, it may be better to create a conda environment. A conda-environment can be thought of as a sub-installation that can be installed separately alongside your base installation.

The latest versions of the Spyder and JupyterLab IDEs available on the conda-forge channel are examples of Python packages with a large number of dependencies and should be installed in their own environments when possible.

The conda commands to create these are available below for advanced users. This guide will later explain these commands in more detail:

conda env list
conda remove -n spyder5 --all
conda remove -n jupyterlab3 --all
conda create -n spyder
conda activate spyder
conda install -c conda-forge spyder=5.2.1
conda install -c conda-forge cython seaborn sympy openpyxl xlrd xlsxwriter
spyder
conda create -n jupyterlab
conda activate jupyterlab
conda install -c conda-forge jupyterlab=3.2.5
conda install -c conda-forge cython seaborn sympy openpyxl xlrd xlsxwriter
conda install -c conda-forge nodejs ipywidgets jupyterlab-variableinspector ipympl plotly jupyterlab-drawio
jupyter-lab

Installing Anaconda

Uninstalling Old Versions

Skip this if this is a clean installation of Ubuntu and neither Anaconda nor Miniconda has previously been installed.

Make sure all programs are closed:

Open up Files:

Select Settings and then Show Hidden Files:

Delete the anaconda3 folder and the following configuration folders and files: .anaconda, .conda, .continuum, .ipython, .jupyter and .condarc.

Now go to the .config folder:

Delete matplotlib and spyder-py3:

Go to .local:

Then share:

Delete the jupyter, Spyder (and kite if present) folders:

Open the .bashrc file in text editor:

Delete the conda initialize block, that is the lines from # >>> conda initialize >>> to # <<< conda initialize <<< inclusive:

Anaconda is now uninstalled and previous configuration files are purged. You may now Hide the Hidden Files:
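To double check that nothing was missed, the folders and files named in the steps above can be looked for with a short script. This is only a sketch: the helper name find_leftovers is hypothetical, the list simply repeats the names given above, and the script only lists paths, it does not delete anything.

```python
from pathlib import Path

# Names taken from the steps above, relative to the home folder
LEFTOVERS = [
    "anaconda3", ".anaconda", ".conda", ".continuum", ".ipython",
    ".jupyter", ".condarc",
    ".config/matplotlib", ".config/spyder-py3",
    ".local/share/jupyter", ".local/share/Spyder", ".local/share/kite",
]


def find_leftovers(home):
    """Return the leftover paths that still exist under the given home folder."""
    home = Path(home)
    return [home / name for name in LEFTOVERS if (home / name).exists()]


for path in find_leftovers(Path.home()):
    print(path)
```

If the script prints nothing, none of the listed folders or files remain in your home folder.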

Installing Anaconda

The Anaconda Individual installer or Miniconda installer can be downloaded from the Anaconda website:

Select the Linux installer.

Then select 64 Bit Installer:

Select Save File and then OK:

Wait for the Download to complete:

Go to the Downloads folder and right click it and select Open in Terminal:

Right click the .sh file and select rename:

Select the file name including the extension and select copy:

Type in:

bash Anaconda3-2021.11-Linux-x86_64.sh

Hold down Enter to scroll through the license agreement:

After you reach the end of the license agreement, you will need to accept it. To accept it type in:

yes

Anaconda will install in the default location:

/home/philip/anaconda3

Press Enter to select this default location:

Note File Explorer will automatically open to home for your user profile, so you will see the anaconda3 subfolder.

You will now be asked if you want to initialize Anaconda3 which will update your .bashrc to include the conda commands. This will allow you to use the conda commands in your terminal. Your .bashrc file is hidden, to view it select File Explorer Settings and Show Hidden Files:

Type in:

yes

Anaconda will now be installed:

The .bashrc file will now contain the conda command. The terminal must now be closed and reopened. During launch it will examine the .bashrc and now it will include the conda commands:

Notice each prompt begins with (base) which means the base conda env is selected:

If your terminal prompt does not begin with (base), then it is likely that during installation you pressed Enter at the prompt to run conda init. The default option is unfortunately no, and a notification (that is also pretty easy to skim past) at the end of the installation stated:

You have chosen to not have conda modify your shell scripts at all. To activate conda's base environment in your current shell session:

eval "$(/home/philip/anaconda3/bin/conda shell.YOUR_SHELL_NAME hook)"

This means your .bashrc file lacks the conda commands and therefore your terminal will not be able to use conda commands although Anaconda is installed. This is one of the most common installation issues with Anaconda on Linux.

To rectify this open up your .bashrc file in a text editor and copy and paste the following lines of code at the bottom of the file:

# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/home/philip/anaconda3/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
    eval "$__conda_setup"
else
    if [ -f "/home/philip/anaconda3/etc/profile.d/conda.sh" ]; then
        . "/home/philip/anaconda3/etc/profile.d/conda.sh"
    else
        export PATH="/home/philip/anaconda3/bin:$PATH"
    fi
fi
unset __conda_setup
# <<< conda initialize <<<

Replace my username philip with your username (4 occurrences).

Save the updated .bashrc file.

Close any open terminals and then open up a new terminal. The prompt should now begin with (base).
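If the prompt still does not begin with (base), it may help to confirm that the block was actually saved. The sketch below checks a .bashrc file for the two marker comments shown above; the function name has_conda_init_block is hypothetical and the check only looks for the markers, not for the correctness of the lines in between.

```python
from pathlib import Path

START = "# >>> conda initialize >>>"
END = "# <<< conda initialize <<<"


def has_conda_init_block(text):
    """Return True if the start marker is present and followed by the end marker."""
    start = text.find(START)
    return start != -1 and text.find(END, start) != -1


bashrc = Path.home() / ".bashrc"
if bashrc.exists():
    print(has_conda_init_block(bashrc.read_text(errors="replace")))
else:
    print("No .bashrc file found")
```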

The Anaconda Navigator

On Linux there are no Start Menu shortcuts for Anaconda. The Anaconda Navigator must be launched via the Terminal. To launch it type in:

anaconda-navigator

Anaconda Navigator has some settings which can be accessed using File → Preferences:

If the menu displays you can use the GUI to change the settings and select Apply. On some systems if Enable DPI Scaling is Enabled, this window may be too big and you will need to manually change the anaconda-navigator.ini file:

This is found in the .anaconda subfolder:

Then in the navigator subfolder:

And can be opened with text editor:

Anaconda Navigator can be used to launch the Spyder and JupyterLab IDEs:

There are also some limited capabilities when it comes to viewing your conda environments and updating some packages. This is generally better done via the terminal:

JupyterLab Preferences

When JupyterLab is launched, you will likely get:

Access to the file was denied

The file at .local/share/jupyter/runtime/jpserver-22284-open.html is not readable.

It may have been removed, moved, or file permissions may be preventing access.

The reason for this is that JupyterLab launches in the browser and all browsers in Ubuntu 20.04 LTS and newer are now installed by default as snap applications. For security purposes snap applications are sandboxed to prevent them accessing or changing configuration files.

In order to use JupyterLab we need to change its configuration settings so that it doesn't look for this file. Go to the .jupyter folder.

Now in the terminal type in:

jupyter notebook --generate-config

The jupyter_notebook_config.py file will appear in your folder. Select Open with Text Editor:

Press Ctrl + f and search for c.NotebookApp.use_redirect_file (line 543 in my case). Uncomment this line by removing the # and assign it the value False:

c.NotebookApp.use_redirect_file = False

Save the file:

JupyterLab should now launch:

JupyterLab generally works a bit better in a Chromium based browser, so we can search for c.NotebookApp.browser (line 157 in my case), uncomment it and add the desired default browser e.g. Chromium into the str:

c.NotebookApp.browser = 'Chromium'

Save the file:

JupyterLab should now launch in Chromium:

The Terminal

The terminal can be used to launch Spyder by typing in the command:

spyder

And JupyterLab by using:

jupyter-lab

Oddly this is the only place where there is a - between jupyter and lab and it won't work without it.

Note the Terminal will be busy if it is used to run Anaconda-Navigator, Spyder or JupyterLab and closing down the terminal window will end the application.

In the case of JupyterLab, the terminal will still be busy after the browser tab is closed. You will need to cancel the current operation running in the terminal. To do this highlight the terminal window and press Ctrl + c:

Input:

y

You should now get a new prompt, allowing you to begin inputting a new command:

The conda Package Manager

The terminal also can use the command conda which allows one to use the conda package manager to manipulate the packages installed in their (base) conda environment or alternatively to create a separate environment:

You will see the following positional arguments:

  • list
  • search
  • update
  • install
  • uninstall

As well as config to configure the condarc file and create to create a new conda environment:

conda list

List will list all of the packages installed in your conda environment:

conda list

The (base) conda environment is quite extensive:

The module name and module version are displayed for each module. Before looking at these in more detail, it is worth manually examining the folder structure in Files. In:

anaconda3

There is a lib folder:

Which contains a python version folder:

Within this folder are inbuilt modules. They can either be a single module file or a package which is essentially a folder of module files. For example there is the datetime.py and we have seen before that we can open .py files in a text editor:

When we use:

import datetime

We are importing the single script file datetime.py and can access its contents using a dot:

import datetime
datetime.

email is a package with multiple modules:

One of these script files is called __init__.py and is referenced when we import the package using the name of the folder:

import email

Notice that the other script files (known as modules) available in this folder are accessible from this with a dot:

import email
email.

Most of the contents in the lib folder are Python Standard Libraries (inbuilt into Python). Details about these are available in the Python Module Index:

Note some of the modules such as math are written in C and there is no .py file available.
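The three cases above (a single module file, a package folder and a compiled module) can be told apart from within Python. The following is a minimal sketch; the helper name describe and its three labels are made up for illustration, and it relies on the fact that packages carry a __path__ attribute while compiled or built-in modules have no .py file.

```python
import datetime
import email
import math


def describe(module):
    """Classify a module as a package, a .py module, or compiled/built-in."""
    if hasattr(module, "__path__"):
        # Only packages (folders of modules) have a __path__ attribute
        return "package"
    path = getattr(module, "__file__", None)
    if path and path.endswith(".py"):
        return "module"
    # No .py file: either a compiled extension or built into the interpreter
    return "compiled"


for module in (datetime, email, math):
    print(module.__name__, describe(module))
```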

Don't get overwhelmed, when beginning you should learn the Python Programming language and only a handful of the more useful standard packages such as datetime and math. Then only the additional packages specialised towards your task.

Third-party packages are installed in the site-packages subfolder:

anaconda3/lib/python3.9/site-packages

This is essentially where the Anaconda (base) environment adds additional functionality. There is normally a folder stating the version of the package and then a folder containing the script files, including the __init__.py script file.

There are a handful of data science packages that are extremely widely used such as numpy, matplotlib, pandas and seaborn which are typically imported using a 2-3 character alias:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

In the case of matplotlib, the pyplot module inside the package is imported directly rather than the package as a whole (the package's __init__.py is still run as part of the import).

Note however despite their widespread use, these are not "Python Standard Libraries" and therefore not in the Python Module Index (mentioned above).
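One rough way to see this distinction in practice is to check where a module would be loaded from. The sketch below is an approximation only: the helper name is_third_party is hypothetical, and it assumes third-party packages live in a site-packages folder (or dist-packages on some systems), which is the usual layout but not guaranteed.

```python
from importlib.util import find_spec


def is_third_party(name):
    """True if found in site-packages, False if found elsewhere, None if not installed."""
    spec = find_spec(name)
    if spec is None:
        return None
    origin = spec.origin or ""
    # Third-party packages normally live in site-packages (dist-packages on some systems)
    return "site-packages" in origin or "dist-packages" in origin


for name in ("datetime", "math", "numpy", "pandas"):
    print(name, is_third_party(name))
```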

Note code-completion often works slightly better with the standard libraries than third-party libraries. For a third-party module, the module needs to be imported into the actively running Kernel in order for code-completion options to show. In JupyterLab, this is typically done by running the imports in an earlier cell and in Spyder, the run line or run cell option can be used to do this:

conda update

The package name can be used with the search, update, install and uninstall positional arguments. In the case of update, the flag --all can be used to update every package in the entire base environment to the latest version in the conda channel:

conda update --all

The conda package manager will attempt to solve your conda environment, that is look for any incompatibilities, then inform you of the proposed changes:

Input:

y

to proceed.

You may want to repeat the check for updates until you are informed all updates are complete.

We can search for a conda package by using search followed by the package name. For example spyder:

conda search spyder

This will give you the latest version of Spyder on the conda channel.

There are two mainstream channels used with the conda package manager, the conda channel and the conda-forge channel. The first is maintained by the Anaconda company and the second is the community channel maintained directly by the software developers.

As a rule of thumb, newer versions of software packages are usually on the conda-forge channel and some packages are only on the conda-forge channel. However there are some packages that were previously maintained by software developers and are now maintained directly by the Anaconda company. The channel to use in the search can be selected using the flag -c followed by the channel name (the Anaconda-maintained channel is named defaults). As a rule of thumb it is good to search both channels:

conda search -c defaults spyder
conda search -c conda-forge spyder

In this example we see a newer version of Spyder:

Let's search for a smaller package called python-docx:

conda search python-docx
conda search -c conda-forge python-docx

conda install

We can install a package into our base environment using the command:

conda install -c conda-forge python-docx

This will install the latest version of python-docx from the channel conda-forge.

Once again you will be informed of the changes, input:

y

to proceed.

The changes are now made:

The python-docx folder is now in site-packages:

Sometimes the condarc file is set to prefer the conda channel even if it has an older version than the conda-forge channel. For example in the case of Spyder:

conda install -c conda-forge spyder

In such cases you can override this by specifying the version to install:

conda install -c conda-forge spyder=5.2.1

Some packages, for example the Spyder IDE have a large number of dependencies and therefore it will take the conda package manager a long time to solve this environment (sometimes forever if the changes are too complex).

It is generally better to instead install these in a new conda environment which will be discussed later:

I will cancel the changes:

n

conda uninstall

If I wanted to uninstall a package, for example python-docx I could use:

conda uninstall python-docx

revision

Every time a change is made by the conda package manager, a new numbered revision is recorded. The changes made in each revision can be listed using:

conda list --revisions

The environment can be rolled back to a given revision, for example revision 0, using:

conda install --revision 0

The pip (Pip Installs Packages) Package Manager

pip can be used to install packages but using pip does not perform all the checks to solve the conda environment and is therefore more likely to result in a broken conda environment. It should only be used in the rare case where the package is not on the conda or conda-forge channels:

pip install python-docx

The Spyder 5 IDE

Spyder 5 is one of the best IDEs for learning Python and one of the most popular for DataScience.

Spyder Preferences

The Spyder Preferences can be altered by going to Tools and Preferences:

In the Appearance Tab, the syntax highlighting theme can be changed from Spyder Dark to Spyder:

In the Editor, Indent Guides and Blank Spaces can be shown:

Select Apply and then Yes:

Spyder will restart using the Spyder Syntax highlighting scheme:

File Menu

Spyder has a file menu to save and open script files. Each script displays in its own tab. The current directory (the directory of the last script that was run) shows at the top:

We can save our script file in Documents for example:

In this case I will call it spyder_script

The file name now displays at the top, followed by a * whenever there are unsaved changes:

Syntax Highlighting

Syntax highlighting is carried out by default. Numeric values on line 2 and 3 are highlighted in brown. strs on line 7, 14, 15 and 16 are highlighted in green.

Note the matching bracket for the bracket selected on line 16 is highlighted in line 14.

There is a typo in the code and this is marked by an x on line 16 as the variable boll_num isn't defined:

Once this is fixed, there is no error:

Run Script

We can put some test code to create fundamental numeric variables and text variables. We can also create collections using the inbuilt classes:

#%% Fundamental Numeric Datatypes
full_num = 5
dec_num = 10.5
bool_num = True

#%% String
string = "Hello"

#%% Collections
list_col = [full_num, dec_num, bool_num, string]
tuple_col = (full_num, dec_num, bool_num, string)
dict_col = {"full": full_num, "dec": dec_num, "bool": bool_num, "string": string}

Upon first launch we are prompted for the run settings which we can leave as default (these can later be changed in preferences if desired):

Variable Explorer

Spyder has a Variable Explorer which can be used to explore these variables. Each variable type is listed, alongside its size. In the case of a string, this is the number of characters and in the case of a list, this is the number of items in the list. Collections of variables for example this dict can be expanded to view in more detail:
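The Type and Size columns correspond roughly to what the inbuilt type and len functions report. A minimal sketch using the variables from the test script (this is an analogy for illustration, not a description of how the Variable Explorer is implemented):

```python
full_num = 5
dec_num = 10.5
bool_num = True
string = "Hello"
list_col = [full_num, dec_num, bool_num, string]
dict_col = {"full": full_num, "dec": dec_num, "bool": bool_num, "string": string}

for name, value in [("string", string), ("list_col", list_col), ("dict_col", dict_col)]:
    # type(...).__name__ gives the type name, len(...) gives the size
    print(name, type(value).__name__, len(value))
```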

Script Editor and Console

Spyder has a script editor and a console. The console keeps a track of the number of executions sent to the kernel. For example if the script is run to create the variable full_num, the value of full_num is shown in the variable explorer and we are informed in the console that we have run the script file:

full_num = 5

If we modify the script to print full_num and run it again:

full_num = 5
print(full_num)

The number of executions is now 2 and we are informed the script is run. The value 5 also displays below this as we used the print function:

Operations can be carried out in the console. For example this single line operation was the third execution to take place:

The console is often used to test out a quick line or two of code before adding it to a script.

Kernel

Restarting the Kernel will clear all variables from the Variable Explorer, clear the Console and unload any imported modules. The number of executions will return to 0.

The Kernel can be Restarted by going to Consoles → Restart Kernel:

Select Yes to Proceed:

The Kernel is now clear:

Cells

Comments can be added to the script by beginning with #. If a line begins with #%% it will create a new cell and the currently selected cell is highlighted in yellow.

We can run a single cell by selecting run cell:

Notice only the variables defined in the first cell display in the variable explorer and the first cell is still highlighted:

We can restart the Kernel and instead select the next button, which runs the current cell and moves on to the next cell:

Notice how the second cell is highlighted after the first cell is executed:

Finally we can use the 4th Run button to run only the highlighted selection:

Importing DataScience Libraries

Suppose we want to record some dependent y data values with respect to independent x values. We could use two lists to create two equally sized numeric vectors, a nested list or a dictionary:

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

xy = [[1, 2, 3, 4, 5], [2, 4, 6, 8, 10]]

xy2 = {"x": [1, 2, 3, 4, 5], "y": [2, 4, 6, 8, 10]}

Note however that each of these objects is 1 dimensional. i.e. xy is a list of lists.

Moreover the datatype of each item in the list can be independent which offers the most flexibility but it is not particularly useful in some cases where one is trying to plot the data to see a trend for example.

Finally the operators available for a collection such as a list are not optimised for numeric data. The + operator for example will concatenate two lists, making a longer list, in a similar manner to the + operator between two strings. It is not set up like the + operator between two ints, which performs numeric addition.
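A short example using the x and y lists from above shows the difference. Element-wise addition has to be written out explicitly for lists, here with zip and a list comprehension:

```python
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

# + joins the two lists end to end
joined = x + y

# Element-wise addition has to be written out explicitly for lists
summed = [a + b for a, b in zip(x, y)]

print(joined)
print(summed)
```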

We have two Python libraries based upon additional datatypes: numpy, which is based upon numeric Python arrays (which can be visualised as a mathematical matrix), and pandas, which is based upon a DataFrame (which can be visualised as an Excel spreadsheet). These datatypes have a number of methods and operators which, in the case of a numeric array for example, carry out numeric operations across the whole array.

Let's first examine numpy. Note the import line should be highlighted and run, which will execute it as shown in the console:

As numpy is imported into the kernel, code-completion for numpy will be accessible. Typing np. displays a selection of objects which can be accessed from the numpy library:

We can use the function array to create a new numpy array. When this function is typed with parentheses, details about the function's input arguments are shown.

Spyder has a Help Pane. Highlighting a function or class and pressing Ctrl + i will inspect it:

And attempt to retrieve the documentation:

For functions or classes from the DataScience libraries this can sometimes be quite limited and only give details about the library and not the specific function or class:

Usually a more detailed docstring can be accessed directly from the console by typing in a ? followed by the function or class to be investigated:

? np.array

Use the mouse wheel to scroll through the documentation and press q to quit the pager:

This will take you to the next line in the console:

We need the object which is usually a list or a list of lists. Everything else shown is optional and the datatype will be automatically determined:

import numpy as np
x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 4, 6, 8, 10])
xy = np.array([[1, 2, 3, 4, 5], [2, 4, 6, 8, 10]])
xy2 = np.transpose(xy)

If we run the code above we can see all the objects in the variable explorer. Since the datatype is constant for each cell in a numpy array, this is just shown on the Variable Explorer:
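Some of the details the Variable Explorer shows can also be printed directly. The sketch below assumes numpy is installed (it is included in the Anaconda base environment) and reuses the arrays from above:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 4, 6, 8, 10])
xy = np.array([[1, 2, 3, 4, 5], [2, 4, 6, 8, 10]])
xy2 = np.transpose(xy)

print(x + y)      # element-wise addition, not concatenation
print(x.dtype)    # one datatype shared by every element
print(xy.shape)   # 2 rows, 5 columns
print(xy2.shape)  # 5 rows, 2 columns
```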

We can now look at pandas. Once again the import line should be run in order to allow the code-completion to work:

In pandas we use the DataFrame (CamelCaseCapitalization) class to create an instance (our variable name xy):

We need to supply the data. This is provided in the form of a dictionary where the keys are strings of the column names and the values are lists. Notice how the dataframe xy looks like a matrix but each column is clearly labelled as "x" and "y" respectively:
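A minimal sketch of this step, assuming pandas is installed (it is included in the Anaconda base environment) and reusing the x and y values from earlier:

```python
import pandas as pd

# Keys become the column names, the lists become the column values
xy = pd.DataFrame({"x": [1, 2, 3, 4, 5], "y": [2, 4, 6, 8, 10]})

print(xy)
print(xy.columns.tolist())
```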

Files

Notice the dataframe is in the form of an Excel sheet and closely resembles the one below which is saved in Documents as Book1.xlsx:

We can use the read_excel function to read this data and create a new instance of a dataframe.

This Excel File has the title "Book1.xlsx" and is in the same folder as the spyder_script.py file. These can be seen by using the Files Tab. Each column has a name and the default Sheet name Sheet1 is used. Therefore we don't need to override the default values of any of the keyword input arguments in the read_excel function.

This reads in the data as a dataframe:

Plotting

Now we've got data, we can have a look at plotting it. There are a number of Python plotting libraries. The most frequently used one is matplotlib. It is frequently used with seaborn which acts like a wrapper around matplotlib to consistently change the styles of the plots and add some additional plot types (commonly used in data science).

We can now look at matplotlib. Once again the import line should be run in order to allow the code-completion to work:

We can use the function plot to create a basic line plot:

For the args, we need to provide the x and y data. We can access a column from a dataframe as an attribute:

import pandas as pd
import matplotlib.pyplot as plt

xy = pd.read_excel("Book1.xlsx")

plt.plot(xy.x, xy.y)

Spyder by default, displays the plots as inline in the plots pane:

To change this and instead use automatic plotting to create each plot in its dedicated window select Tools → Preferences:

Select IPython Console. Then to the left select Graphics and change the setting to Automatic. Then select Apply and Restart the Kernel:

Now relaunching the script shows an Automatic plot in its own dedicated Window:

We can also import seaborn and run the selection to allow code-completion to take place:

We can then change the style of the plots using set_style:

In this example, I am using whitegrid:

seaborn includes additional plot types that are used particularly in science. Some of these are duplicates with matplotlib but the syntax is a bit more geared towards dataframes.

We can use the function figure to create a new plot and assign the figure number. We will use figure 1 to create a line plot using matplotlib and figure 2 to create a lineplot with seaborn:

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_style("whitegrid")

xy = pd.read_excel("Book1.xlsx")

plt.figure(1)
plt.plot(xy.x, xy.y)
plt.xlabel("x (units)")
plt.ylabel("y (units)")

plt.figure(2)
sns.lineplot(data=xy, x="x", y="y")
plt.xlabel("x (units)")
plt.ylabel("y (units)")

The JupyterLab 3 IDE

Recall that there is no Start Menu shortcut to JupyterLab on Linux and it needs to be launched via the Anaconda Navigator or the terminal.

File Explorer

JupyterLab is browser based. To the left hand side is the file explorer alongside a File Menu which can be used to save files:

To the right hand side is the launcher. There are three commonly used file types: the text file, the Markdown file and the Notebook file:

Text File

The text file is essentially a plain text file and has the same capabilities as Notepad, i.e. you can write text with no formatting capabilities:

The file can be renamed, either by renaming the tab at the top or by renaming the file in the JupyterLab file explorer to the left hand side:

In this case, I can rename it as textfile.txt:

This file can be viewed in File Explorer and opened in Text Editor:

Markdown File

We can use the + button at the top of the JupyterLab file explorer to open a new Launcher as a new tab to the right hand side.

We can then create a new Markdown file:

We can rename it, similar to the text file:

Markdown has basic formatting capabilities. We can see the Markdown Preview by right clicking on some blank space and selecting Show Markdown Preview:

Headings

We can use a series of # to mark the heading level:

# Heading 1
## Heading 2
### Heading 3
#### Heading 4

JupyterLab has a navigation pane and the headings will auto-populate this as they are created.

Formatted Text

We can enclose text in * or ~ to format it. One set of * makes text italic, two sets of * makes it bold and three sets of * makes it bold-italic. Two sets of ~ make it strike-through:

Let's make a sentence with *italic text*, **bold text**, ***bold-italic*** and ~~strike-through~~ text.

Escape Characters

When we don't want to use one of the formatting characters to format the text but rather include it in the text, we can prepend it with \ to insert an escape character. If we want to insert \ we use \\ where the first \ denotes insertion of an escape character and the second \ is the character to be inserted:

Let's make a sentence with \*italic text\*, \*\*bold text\*\*, \*\*\*bold-italic\*\*\* and \~\~strike-through\~\~ text.

\\

Spacing

For convenience a long sentence written over multiple lines is formatted as a single sentence:

She sells
seashells
on the
seashore

If we want to deliberately separate it out into different lines, we must leave a blank line between each line:

She sells

seashells

on the

seashore

Bullet Points

We can create a bullet point list by prepending each line with *, or a numbered list by prepending each line with 1., 2. and so on.

Bullet Point List:

* one
* two
* three

Bullet Point List (spaced):

* one

* two

* three

Numeric List:

1. one
2. two
3. three

Numeric List (spaced):

1. one

2. two

3. three

Tables

We can use the pipe | to create a table, row by row. The first row contains the column names and the second row contains the column formats (which can be set to default, left aligned, right aligned or centred):

|num|number|
|---|---|
|1|one|
|2|two|
|3|three|

|num|number|
|---|:-|
|1|one|
|2|two|
|3|three|

|num|number|
|---|-:|
|1|one|
|2|two|
|3|three|

|num|number|
|---|:-:|
|1|one|
|2|two|
|3|three|

We can create a link by enclosing the link text in square brackets [ ] followed by the link address enclosed in round brackets ( ). If the link points to an image, the image can be displayed by prepending the link with !

[Anaconda Individual Edition](https://www.anaconda.com/products/individual)
[Spyder Image Wikipedia](https://upload.wikimedia.org/wikipedia/commons/thumb/1/1b/Spyder-windows-screenshot.png/300px-Spyder-windows-screenshot.png)
![Spyder Image Wikipedia](https://upload.wikimedia.org/wikipedia/commons/thumb/1/1b/Spyder-windows-screenshot.png/300px-Spyder-windows-screenshot.png)
![Pic to Display](pic_to_be_displayed_in_same_folder_as_markdown_file.png)

Code

If we want to include code within a sentence, we can begin and end the code with 3 back-quotes (`), and we can place 3 back-quotes on a new line of their own to begin and end a code block:

The first line of code is ```print("Hello World")```.

The code block is:
```
print("Hello World")
print("Goodbye World")
```
It prints.

Equations (LaTeX)

A single $ on each side can be used to enclose an inline LaTeX equation. Two $ on each side can be used to enclose a display equation.

The inline equation is $ $

The display equation is:
$$ $$
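As a minimal illustration of the two forms, the following Markdown uses two well-known equations chosen purely as examples (they are assumptions for illustration and not the equations used elsewhere in this guide):

```latex
The inline equation is $y = mx + c$

The display equation is:
$$ x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} $$
```

The inline form stays within the sentence, whereas the display form is rendered on its own line.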

Use of LaTeX is out of scope for this guide. However, I will demonstrate using an equation created in equalx. To install equalx, open a new terminal window and use:

sudo apt-get install equalx

Then authenticate using your password in order to proceed. To launch equalx, in a new terminal window, type:

equalx

Notebook File

Now that we have an overview of markdown, we can create a JupyterLab notebook file.

Markdown and Code Cells

Unlike Spyder, where the use of code cells was optional, in a JupyterLab notebook file cells are mandatory. Each cell can be run as either a Code Cell or a Markdown Cell.

In a markdown cell, we can use markdown syntax:

When we run the cell it will show the equivalent as seen in the markdown preview:

We can type in Python code in an almost identical way to how we would input it in Spyder. In this case we are going to import our data science libraries (so we can use them and their code-completion in later cells):

Note that the number 1 is displayed, which indicates that it is the first cell run on the current kernel:
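As a rough sketch, the import cell might look like the following. The choice of these three libraries and their aliases (np, pd, plt) follows common convention and is an assumption here, since the exact cell contents are only shown in the screenshot:

```python
# Conventional aliases for the commonly used data science libraries
# (assumed here; adjust to match the libraries you actually need).
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
```

Once this cell has been run, the names np, pd and plt are available to every later cell that uses the same kernel.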

We can now create another markdown cell and run it as before:

Code Completion

If we type in an object such as an imported library followed by a dot . and then tab we can view the objects we can use from it:
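Outside of the pop-up list, a rough equivalent can be produced in code with the built-in dir function. This is only a sketch, using the standard library math module as a stand-in for any imported library:

```python
import math  # stand-in for any imported library

# dir() returns the attribute names of an object. Names beginning with an
# underscore are filtered out here to keep the list short.
public_names = [name for name in dir(math) if not name.startswith("_")]

# Show the first few names only
print(public_names[:10])
```

The names listed are similar to those offered by the tab completion list.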

When this second code cell is run, it has the number 2, indicating it is the second code cell run on the current kernel:

To get details about the input arguments of, for example, a function or class, we can type the function name or class name with an open parenthesis and then press shift and tab together to display the documentation as a pop-up balloon:

We can also append the object we are interested in with a ? and display its docstring in the cell output:
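Note that the ? suffix is IPython syntax and is not valid in a plain Python script. As a minimal sketch of how similar information can be retrieved in ordinary Python, the standard library inspect module can be used. The json.dumps function is used here purely as a stand-in example:

```python
import inspect
import json  # stand-in for any imported library

# The signature lists the input arguments and their default values
signature = inspect.signature(json.dumps)

# getdoc returns the docstring with its indentation cleaned up
docstring = inspect.getdoc(json.dumps)

print(f"json.dumps{signature}")
print(docstring.splitlines()[0])  # first line of the docstring only
```

The built-in help function can also be used, for example help(json.dumps), to print the full documentation.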

Cell Output

If the docstring is long, we can right click the cell and optionally select Enable Scrolling for Output:

Note that this cell is the third code cell run in the active kernel. If we change the cell type to markdown and run it again, the number at the side of the cell is no longer displayed:

We can use the read_excel function that we used before to read a file in the same folder as the notebook file:

Note that the cell number is now 4 (cell 3 has been modified to markdown but it was still run in the past so this cell displays 4). The contents of the cell have been assigned to a variable and therefore there is no output to the cell:

If we remove the assignment to the left hand side and rerun the cell, we will instead see the dataframe in the cell output. Because this cell is rerun, it now displays 5.

If we assign it instead to the variable data we can see the cell output is once again empty:

If we type data in another cell, we will view the contents of data in the cell output, because the value isn't assigned to a new variable name. Alternatively, we can use the print function in the cell in which we create the variable data to view it in the cell output:
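The behaviour described above can be sketched as follows. The DataFrame here is built from made-up values as a stand-in for the file read with read_excel, so the column names and contents are assumptions:

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame returned by read_excel
data = pd.DataFrame({"num": [1, 2, 3], "number": ["one", "two", "three"]})

# An assignment is a statement, so a cell that ends here shows no output.

# In a notebook, a cell whose last line is the bare name displays its value
# (in a plain script this line does nothing):
data

# print sends the text representation to the output from any line:
print(data)
```

In a notebook the bare name gives a formatted table, whereas print gives plain text.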

Plotting

We can create a plot using code almost identical to that used in Spyder. By default, inline plotting is used and the plot is shown as a static image in the cell output: