Python on HPC

The following is a brief guide to running Python jobs on HPC. For more details, please see the New User Guide or Running a job on HPC using Slurm.

HPC maintains multiple versions of Python and Anaconda in /usr/usc/python. These directions always refer to a version of Python under /usr/usc/python — not to the operating system's Python at /usr/bin/python (if it exists).

Managing Python Packages

Before you run Python packages on HPC, follow the steps in this section to check which packages are currently installed, create storage space for your Python packages, and share packages among project members.

Pre-Installed Packages

HPC installs a number of distributed-computing-related packages when it installs a new version of Python. The packages can vary for each version. For version 3.6.0, the pip3 list command lists the following global packages and dependencies: numpy, scipy, matplotlib, openpyxl, pandas, scikit-image, scikit-learn, pillow, python-igraph, mpi4py, Rpy2, Yapsy, ipython, theano, opencv-python, pycuda, keras, Cython, sparsehash, wheel, and pycairo.

Storing Python Packages

HPC researchers are also encouraged to install their own Python packages on HPC (or upgrade those that were pre-installed). By default, Python installs local (i.e., user) packages in your home directory, in a subdirectory named .local (the leading dot is part of the name and is required), e.g., ~/.local or /home/rcf-40/ttrojan/.local. Python will create this directory if it does not already exist.

Initialization Step

To avoid filling up the limited disk space in your home directory, you must perform a one-time initialization step to change the installation location for Python packages. First, create a new Python_packages directory in your project directory (in the example below, this is done from the home directory). Then, create a symbolic link to your new package directory and name it .local.

cd ~
mkdir /home/rcf-proj/<project>/<username>/Python_packages
ln -s /home/rcf-proj/<project>/<username>/Python_packages .local

Where <project> is your project name and <username> is your username.

A symbolic link appears to be identical to the file or directory it links to. You can see that it is actually a link by typing ls -la. (The ‘a’ is necessary because “dot” files are hidden from regular listings.)

$ ls -la .local
lrwxr-xr-x  1 ttrojan lc_tt1  Apr 10 .local -> /home/rcf-proj/tt1/ttrojan/Python_packages/

When you install packages, Python will still place them in ~/.local by default and the symlink will reroute the files to the Python_packages/ directory in your project space.

Sharing Python Packages

Some research groups may find it convenient to use a shared package directory so that all members can use the exact same packages and conserve their shared disk quota. If your research group wishes to do this, you can create a “Python_packages” directory in the group’s project directory. Each member must then create their own symlink to this directory. Keep in mind that permissions must be set so that the entire group has at least “read” permissions for the group’s directory. Those who will be installing/upgrading packages will also need “write” permissions. NOTE: ~/.local can only be symbolically linked to one directory.
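One member might set the shared directory up as follows. This is a sketch: the paths mirror the placeholders used above, and the permission mode is illustrative — adjust it to your group's policy.

```shell
# One-time setup by a member with write access to the project directory
mkdir /home/rcf-proj/<project>/Python_packages

# Group gets read/execute; the setgid bit (2) makes new files inherit the
# group. Members who will install or upgrade packages also need group write.
chmod 2775 /home/rcf-proj/<project>/Python_packages

# Each group member then links their own ~/.local at the shared directory
cd ~
ln -s /home/rcf-proj/<project>/Python_packages .local
```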

Installing/Upgrading Packages

Once you complete the initialization steps above, you can install Python packages. First, configure your runtime environment by sourcing the setup.sh file for the version of Python you want to use. Once you’ve sourced the setup file, use Python’s package installer, pip, to install packages from the command line.

To perform a new user-level (local) install of the Python package pytest:

$ source /usr/usc/python/3.6.0/setup.sh
$ pip3 install pytest --user 

To upgrade the currently installed Python package pytest:

$ pip3 install pytest --user --upgrade

To see which installed packages are outdated, along with their current and latest available versions:

$ pip3 list -o --format columns
Package           Version    Latest      Type 
----------------- ---------- ----------- -----
h5py              2.8.0      2.9.0       wheel
mpi4py            2.0.0      3.0.0       sdist
  :

NOTE: Python versions 3.X and 2.X have differently named binaries. To invoke Python and pip for Python 3.X, use “python3” and “pip3”; for Python 2.X, simply use “python” and “pip”.
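For example, after sourcing a 3.X setup file, a quick check confirms which interpreter “python3” now resolves to (the exact path and version will vary):

```shell
# Confirm which binary "python3" resolves to on the current PATH
command -v python3

# Print its version to verify you got the release you sourced
python3 --version
```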

Running Python

Running Python Interactively

It is a good idea to test your Python program on an interactive compute node before submitting a batch (remote) job. The Slurm command salloc will request a compute node with 8 CPUs, each with 2GB of memory, for 1 hour and, when the resource is allocated, log you into the node.

[ttrojan@hpc3676]$ salloc --ntasks=8 --mem-per-cpu=2g --time=1:00:00
salloc: Pending job allocation 2377051
salloc: job 2377051 queued and waiting for resources
salloc: job 2377051 has been allocated resources
salloc: Granted job allocation 2377051
salloc: Waiting for resource configuration
salloc: Nodes hpc3676 are ready for job
---------- Begin SLURM Prolog ----------
Job ID:        2377051
Username:      ttrojan
Accountname:   lc_tt1
Name:          sh
Partition:     quick
Nodelist:      hpc3676
TasksPerNode:  8
CPUsPerTask:   Default[1]
TMPDIR:        /tmp/2377051.quick
SCRATCHDIR:    /staging/scratch/2377051
Cluster:       uschpc
HSDA Account:  false
---------- 2018-12-12 17:59:36 ---------
[ttrojan@hpc3676]$  

Once you are on a compute node, select the version of Python you wish to run from /usr/usc/python. Now load, or source, the setup script for your selected version to configure your current environment to find and use that version of Python and pip. You can run your program on the command line with the command python3.

$ source /usr/usc/python/3.6.0/setup.sh
$ python3 hello.py
Hello Tommy

NOTE: The above assumes you are running in a bash shell. For a (t)csh shell, source setup.csh instead.
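For reference, a hello.py that produces the output shown above could be as simple as the following (hypothetical contents):

```python
# hello.py — minimal example program (hypothetical contents)
message = "Hello Tommy"
print(message)
```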

Alternatively, you can run your program within Python’s interpreter. You have to explicitly specify the path of your program.

$ python3
Python 3.6.0 (default, Feb 17 2017, 15:36:40) 
[GCC 4.8.5 20150623 (Red Hat 4.8.5-4)] on linux
Type "help", "copyright", "credits" or "license" for more information.

# If code in same directory where Python was invoked
>>> exec(open('./hello.py').read())
Hello Tommy

# If not, use an absolute path
>>> exec(open('/home/rcf-proj/tt1/ttrojan/python/hello.py').read())
Hello Tommy
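A tidier standard-library alternative to exec is the runpy module, which runs a file the same way the command line would. A minimal sketch — it writes a stand-in hello.py first so the example is self-contained:

```python
import pathlib
import runpy

# Create a stand-in hello.py for this demonstration (hypothetical contents)
pathlib.Path("hello.py").write_text('print("Hello Tommy")\n')

# Execute the file as if it were run as "python3 hello.py"
runpy.run_path("hello.py", run_name="__main__")
```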

Running Python Remotely

Once you are confident that your program will finish without your intervention, you are ready to run it remotely as a batch (non-interactive) job. Batch jobs are submitted to a job scheduler using a text file called a job script, in which you specify the compute resources and commands needed to run your job.

The following job script, myjob.slurm, will request 16 CPUs on a single compute node, each with 2 gigabytes of memory, for 4 hours. When the job starts, it will then set up Python 3.6.0 and run myprogram.py.

#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=16
#SBATCH --mem-per-cpu=2GB
#SBATCH --time=4:00:00
#SBATCH --export=none # Ensures job gets a fresh login environment

source /usr/usc/python/3.6.0/setup.sh
python3 myprogram.py

You can now submit your job for remote processing using Slurm’s sbatch command.

$ sbatch myjob.slurm
Submitted batch job 1131075

To check on the status of your job, use the command squeue -u <username>. If you are on a head node, you can use the HPC squeue wrapper, myqueue.

$ squeue -u ttrojan
  JOBID PARTITION       NAME     USER ST       TIME  NODES NODELIST(REASON)
1131075     quick example_py  ttrojan PD       0:00      1 (Resources)

$ myqueue
JOBID    USER  ACCOUNT  PARTITION  NAME             TASKS  CPUS_PER_TASK  MIN_MEMORY  START_TIME           TIME  TIME_LIMIT  STATE    NODELIST(REASON)
1131075  ttrojan  lc_tt1  quick      example_py       1      16             2G          2018-07-02T11:14:46  1:49  4:00:00     RUNNING  hpc1046

By default, all output sent to the console, including error messages and print statements, is directed to a file named “slurm-%j.out,” where the “%j” is replaced with the job ID number. The file will be generated on the first node of the job allocation.

Any files created by the Python program itself will be created as specified by the program.

Getting Help

For assistance with running Python on HPC, see our Getting Help page or send an email to hpc@usc.edu.