Python on HPC
HPC maintains multiple versions of Python and Anaconda in /usr/usc/python. The following directions always refer to a version of Python under /usr/usc/python, not to any version at /usr/bin/python that may be installed as part of the operating system.
Managing Python Packages
Before you run Python packages on HPC, follow the steps in this section to learn how to check which packages are currently installed, how to create storage space for your own Python packages, and how to share packages among project members.
HPC installs a number of distributed-computing-related packages when it installs a new version of Python. The packages can vary for each version. For version 3.6.0, the pip3 list command lists the following global packages and dependencies: numpy, scipy, matplotlib, openpyxl, pandas, scikit-image, scikit-learn, pillow, python-igraph, mpi4py, Rpy2, Yapsy, ipython, theano, opencv-python, pycuda, keras, Cython, sparsehash, wheel, and pycairo.
Storing Python Packages
HPC researchers are encouraged to install their own Python packages on HPC (or upgrade those that were pre-installed). By default, Python installs local (i.e., user) packages in your home directory, in a subdirectory named .local (the leading dot is part of the name and is required), e.g., ~/.local or /home/rcf-40/ttrojan/.local. Python will create this directory if it does not already exist.
To avoid filling up the limited disk space in your home directory, you must perform a one-time initialization step to change the installation location for Python packages. First, create a new Python_packages directory in your project directory (in the example below, this is done from the home directory). Then, create a symbolic link to your new package directory and name it .local.
cd ~
mkdir /home/rcf-proj/<project>/<username>/Python_packages
ln -s /home/rcf-proj/<project>/<username>/Python_packages .local
Where <project> is your project name and <username> is your username.
A symbolic link appears to be identical to the file or directory it links to. You can see that it is actually a link by typing ls -la. (The ‘a’ is necessary because “dot” files are hidden from regular listings.)
$ ls -la .local
lrwxr-xr-x 1 ttrojan lc_tt1 Apr 10 .local -> /home/rcf-proj/tt1/ttrojan/Python_packages/
When you install packages, Python will still place them in ~/.local by default and the symlink will reroute the files to the Python_packages/ directory in your project space.
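You can verify this rerouting behavior with a quick experiment. The sketch below uses temporary directories as stand-ins for your home and project directories (the real paths on HPC will differ):

```shell
# Stand-in directories: "target" plays the role of Python_packages/,
# "workdir" plays the role of your home directory
target=$(mktemp -d)
workdir=$(mktemp -d)
ln -s "$target" "$workdir/.local"

# Writing through the symlink lands the file in the target directory
touch "$workdir/.local/example.txt"
ls "$target"   # shows example.txt
```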
Sharing Python Packages
Some research groups may find it convenient to use a shared package directory so that all members can use the exact same packages and conserve their shared disk quota. If your research group wishes to do this, you can create a “Python_packages” directory in the group’s project directory. Each member must then create their own symlink to this directory. Keep in mind that permissions must be set so that the entire group has at least “read” permissions for the group’s directory. Those who will be installing/upgrading packages will also need “write” permissions. NOTE: ~/.local can only be symbolically linked to one directory.
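As a sketch, the shared setup might look like the following. The temporary directories are stand-ins so the example is runnable; on HPC, substitute your group's actual project directory and home directory, and note that the mode 2775 (group read/write plus the setgid bit, which keeps new files owned by the group) is one reasonable choice, not a requirement:

```shell
# Stand-in for the group's project directory, e.g. /home/rcf-proj/<project>
PROJECT_DIR=$(mktemp -d)

# One member creates the shared directory with group read/write access;
# the setgid bit (the leading 2) keeps new files group-owned
mkdir "$PROJECT_DIR/Python_packages"
chmod 2775 "$PROJECT_DIR/Python_packages"

# Each member then links ~/.local to it (shown here against a stand-in home)
HOME_DIR=$(mktemp -d)
ln -s "$PROJECT_DIR/Python_packages" "$HOME_DIR/.local"
```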
Once you complete the initial steps above, you can install Python packages. First, configure your runtime environment by sourcing the setup.sh file for the version of Python you want to use. Once you’ve sourced the setup file, use Python’s package installer (PIP) to install packages from the command line.
To perform a new user (local) install of the Python package pytest:
$ source /usr/usc/python/3.6.0/setup.sh
$ pip3 install pytest --user
To upgrade the currently-installed Python package pytest:
$ pip3 install pytest --user --upgrade
To see a list of all installed packages and their current and latest versions:
$ pip3 list -o --format columns
Package           Version    Latest      Type
----------------- ---------- ----------- -----
h5py              2.8.0      2.9.0       wheel
mpi4py            2.0.0      3.0.0       sdist
:
NOTE: Python versions 3.X and 2.X have differently named binaries. To invoke Python and pip for Python 3.X, use “python3” and “pip3”; for Python 2.X, simply use “python” and “pip”.
Running Python Interactively
It is a good idea to test your Python program on an interactive compute node before submitting a batch (remote) job. The Slurm command salloc will request a compute node with 8 CPUs, each with 2GB of memory, for 1 hour and, when the resource is allocated, log you into the node.
[ttrojan@hpc3676]$ salloc --ntasks=8 --mem-per-cpu=2g --time=1:00:00
salloc: Pending job allocation 2377051
salloc: job 2377051 queued and waiting for resources
salloc: job 2377051 has been allocated resources
salloc: Granted job allocation 2377051
salloc: Waiting for resource configuration
salloc: Nodes hpc3676 are ready for job

---------- Begin SLURM Prolog ----------
Job ID:        2377051
Username:      ttrojan
Accountname:   lc_tt1
Name:          sh
Partition:     quick
Nodelist:      hpc3676
TasksPerNode:  8
CPUsPerTask:   Default
TMPDIR:        /tmp/2377051.quick
SCRATCHDIR:    /staging/scratch/2377051
Cluster:       uschpc
HSDA Account:  false
---------- 2018-12-12 17:59:36 ---------

[ttrojan@hpc3676]$
Once you are on a compute node, select the version of Python you wish to run from /usr/usc/python. Now load, or source, the setup script for your selected version to configure your current environment to find and use that version of Python and pip. You can run your program on the command line with the command python3.
$ source /usr/usc/python/3.6.0/setup.sh
$ python3 hello.py
Hello Tommy
NOTE: The commands above assume you are running in a bash shell. For a (t)csh shell, source setup.csh instead.
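The hello.py used in these examples is not shown in this guide. A minimal stand-in that produces the same output could be created and run like this:

```shell
# Create a minimal hello.py (a stand-in; the real example file is not shown)
cat > hello.py <<'EOF'
print("Hello Tommy")
EOF

python3 hello.py   # prints: Hello Tommy
```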
Alternatively, you can run your program within Python’s interpreter. You have to explicitly specify the path of your program.
$ python3
Python 3.6.0 (default, Feb 17 2017, 15:36:40)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-4)] on linux
Type "help", "copyright", "credits" or "license" for more information.

# If code is in the same directory where Python was invoked
>>> exec(open('./hello.py').read())
Hello Tommy

# If not, use an absolute path
>>> exec(open('/home/rcf-proj/tt1/ttrojan/python/hello.py').read())
Hello Tommy
Running Python Remotely
Once you are confident that your program will finish without your intervention, you are ready to run it remotely as a batch (non-interactive) job. Batch jobs are submitted to a job scheduler using a text file called a job script, in which you specify the compute resources and commands needed to run your job.
The following job script, myjob.slurm, will request 16 CPUs on a single compute node, each with 2 gigabytes of memory, for 4 hours. When the job starts, it will then set up Python 3.6.0 and run myprogram.py.
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=16
#SBATCH --mem-per-cpu=2GB
#SBATCH --time=4:00:00
#SBATCH --export=none   # Ensures job gets a fresh login environment

source /usr/usc/python/3.6.0/setup.sh

python3 myprogram.py
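The contents of myprogram.py are not shown in this guide. As one hypothetical sketch, a program that wants to use the CPUs requested with --cpus-per-task can size a process pool from Slurm's SLURM_CPUS_PER_TASK environment variable (set inside a job; the sketch defaults to 1 when run outside a job):

```shell
# A hypothetical myprogram.py that sizes a multiprocessing pool from the
# Slurm allocation; all names here are illustrative
cat > myprogram.py <<'EOF'
import os
from multiprocessing import Pool

# Inside a Slurm job this matches --cpus-per-task; default to 1 elsewhere
ncpus = int(os.environ.get("SLURM_CPUS_PER_TASK", 1))

def square(x):
    return x * x

if __name__ == "__main__":
    with Pool(ncpus) as pool:
        print(pool.map(square, range(8)))
EOF

python3 myprogram.py   # prints: [0, 1, 4, 9, 16, 25, 36, 49]
```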
You can now submit your job for remote processing using Slurm's sbatch command, passing it the name of your job script.
$ sbatch myjob.slurm
Submitted batch job 1131075
To check on the status of your job use the command squeue -u username. If you are on a head node, you can use the HPC squeue wrapper myqueue.
$ squeue -u ttrojan
  JOBID PARTITION       NAME    USER ST  TIME NODES NODELIST(REASON)
1131095     quick example_py ttrojan PD  0:00     2 (Resources)

$ myqueue
  JOBID    USER ACCOUNT PARTITION      NAME TASKS CPUS_PER_TASK MIN_MEMORY          START_TIME  TIME TIME_LIMIT   STATE NODELIST(REASON)
1131075 ttrojan  lc_tt1     quick example_R    16             1         1G 2018-07-02T11:14:46  1:49      30:00 RUNNING hpc1046
By default, all output sent to the console, including error messages and print statements, is directed to a file named “slurm-%j.out,” where the “%j” is replaced with the job ID number. The file will be generated on the first node of the job allocation.
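If you prefer a different output file name, Slurm's --output option can be added to the job script's #SBATCH lines; the file name pattern below is illustrative:

```shell
#SBATCH --output=myprogram-%j.out   # console output goes here instead of slurm-%j.out
```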
Any files created by the Python program itself will be created as specified by the program.