Summary and Schedule


This workshop is an introduction to using high-performance computing systems effectively. We can’t cover every case or give an exhaustive course on parallel programming in just two days’ teaching time. Instead, this workshop is intended to give students a good introduction and overview of the tools available and how to use them effectively.

Prerequisites

Command line experience is necessary for this lesson. If you are new to the command line (also known as the terminal or shell), we recommend working through shell-novice first.

By the end of this workshop, students will know how to:

Identify problems a cluster can help solve.
Use the UNIX shell (also known as terminal or command line) to connect to a cluster.
Transfer files onto a cluster.
Submit and manage jobs on a cluster using a scheduler.
Observe the benefits and limitations of parallel execution.

Getting Started

To get started, follow the directions in the “Setup” tab to download data to your computer and follow any installation instructions.

Note that this is the draft HPC Carpentry release. Comments and feedback are welcome.

For Instructors

If you are teaching this lesson in a workshop, please see the Instructor notes.

Setup Instructions

Download files required for the lesson

00h 00m

1. Why use a Cluster?

Why would I be interested in High Performance Computing (HPC)?
What can I expect to learn from this course?

00h 20m

2. Connecting to a remote HPC system

How do I log in to a remote HPC system?

00h 55m

3. Exploring Remote Resources

How does my local computer compare to the remote systems?
How does the login node compare to the compute nodes?
Are all compute nodes alike?

01h 30m

4. EPCC version - Working on a remote HPC system

What is an HPC system?
How does an HPC system work?
How do I log on to a remote HPC system?

02h 05m

5. Scheduler Fundamentals

What is a scheduler and why does a cluster need one?
How do I launch a program to run on a compute node in the cluster?
How do I capture the output of a program that is run on a node in the cluster?

03h 20m

6. HPCC version - Scheduler Fundamentals

What is a scheduler and why does a cluster need one?
How do I launch a program to run on a compute node in the cluster?
How do I capture the output of a program that is run on a node in the cluster?

04h 35m

7. EPCC version - Working with the scheduler

What is a scheduler and why are they used?
How do I launch a program to run on any one node in the cluster?
How do I capture the output of a program that is run on a node in the cluster?

05h 55m

8. Environment Variables

How are variables set and accessed in the Unix shell?
How can I use variables to change how a program runs?

06h 10m

9. Accessing software via Modules

How do we load and unload software packages?

06h 55m

10. Transferring files with remote computers

How do I transfer files to (and from) the cluster?

07h 25m

11. Running a parallel job

How do we execute a task in parallel?
What benefits arise from parallel execution?
What are the limits of gains from execution in parallel?

08h 55m

12. Using resources effectively

How can I review past jobs?
How can I use this knowledge to create a more accurate submission script?

09h 25m

13. Using shared resources responsibly

How can I be a responsible user?
How can I protect my data?
How can I best get large amounts of data off an HPC system?

09h 45m

Finish

The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.

There are several pieces of software you will wish to install before the workshop. Though installation help will be provided at the workshop, we recommend that these tools are installed (or at least downloaded) beforehand.

A terminal application or command-line interface
A Secure Shell application

Bash and SSH

This lesson requires a terminal application (bash, zsh, or others) with the ability to securely connect to a remote machine (ssh).
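One quick way to confirm that both requirements are met is to ask the shell whether it can find the programs. The following is a minimal sketch rather than part of the lesson material; `have_tool` is a hypothetical helper name introduced here for illustration.

```shell
# have_tool is a hypothetical helper: it succeeds if the named program is on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Report on the two tools this lesson relies on.
for tool in bash ssh; do
    if have_tool "$tool"; then
        echo "$tool: found"
    else
        echo "$tool: NOT found"
    fi
done
```

If either tool is reported as not found, the platform-specific notes in the rest of this page describe how to obtain it.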

Where to Type Commands: How to Open a New Shell

The shell is a program that enables us to send commands to the computer and receive output. It is also referred to as the terminal or command line.

Some computers include a default Unix Shell program. The steps below describe some methods for identifying and opening a Unix Shell program if you already have one installed. There are also options for identifying and downloading a Unix Shell program, a Linux/UNIX emulator, or a program to access a Unix Shell on a server.

Unix Shells on Windows

Computers with Windows operating systems do not automatically have a Unix Shell program installed. In this lesson, we encourage you to use an emulator included in Git for Windows, which gives you access to both Bash shell commands and Git. If you have attended a Software Carpentry workshop session, it is likely you have already received instructions on how to install Git for Windows.

Once installed, you can open a terminal by running the program Git Bash from the Windows start menu.

Shell Programs for Windows

Git for Windows – Recommended
Windows Subsystem for Linux – advanced option for Windows 10

Alternatives to Git for Windows

Other solutions are available for running Bash commands on Windows. There is now a Bash shell command-line tool available for Windows 10. Additionally, you can run Bash commands on a remote computer or server that already has a Unix Shell, from your Windows machine. This can usually be done through a Secure Shell (SSH) client. One such client available for free for Windows computers is PuTTY. See the reference below for information on installing and using PuTTY, using the Windows 10 command-line tool, or installing and using a Unix/Linux emulator.

Advanced users may choose one of the following alternatives:

Install the Windows Subsystem for Linux
Use the Windows PowerShell
Read up on Using a Unix/Linux emulator (Cygwin) or Secure Shell (SSH) client (PuTTY)

Warning

Commands in the Windows Subsystem for Linux (WSL), PowerShell, or Cygwin may differ slightly from those shown in the lesson or presented in the workshop. Please ask if you encounter such a mismatch – you’re probably not alone.

Unix Shell on macOS

On macOS, the default Unix Shell is accessible by running the Terminal program from the /Applications/Utilities folder in Finder.

To open Terminal, try one or both of the following:

In Finder, select the Go menu, then select Utilities. Locate Terminal in the Utilities folder and open it.
Use the Mac ‘Spotlight’ computer search function. Search for: Terminal and press Return.

For an introduction, see How to Use Terminal on a Mac.

Unix Shell on Linux

On most versions of Linux, the default Unix Shell is accessible by running GNOME Terminal, KDE Konsole, or xterm, which can be found via the applications menu or the search bar.

Special Cases

If none of the options above address your circumstances, try an online search for: Unix shell [your operating system].

SSH for Secure Connections

All students should have an SSH client installed. SSH is a tool that allows us to connect to and use a remote computer as our own.
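As a rough illustration, a connection is usually opened by running `ssh` with a username and the address of the remote machine. Both `yourUsername` and `cluster.example.org` below are placeholders assumed for this sketch; the real values depend on the cluster used in your workshop. The sketch only prints the command, so it is safe to run before you have an account.

```shell
# Placeholder values (assumptions): substitute the details for your cluster.
user="yourUsername"
host="cluster.example.org"

# ssh takes the destination in the form user@host.
login_command="ssh ${user}@${host}"

# Print the command rather than running it.
echo "$login_command"
```

Once you have real account details, typing the printed command at the prompt should start the login.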

SSH for Windows

Git for Windows comes with SSH preinstalled: you do not have to do anything.

GUI Support for Windows

If you know that the software you will be running on the cluster requires a graphical user interface (a GUI window needs to open for the application to run properly), please install MobaXterm Home Edition.

SSH for macOS

macOS comes with SSH preinstalled: you do not have to do anything.

GUI Support for macOS

If you know that the software you will be running requires a graphical user interface, please install XQuartz.

SSH for Linux

Linux comes with SSH and X window support preinstalled: you do not have to do anything.
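For the GUI cases described above, OpenSSH clients can request X11 forwarding with the `-X` option, provided the remote system permits it. The sketch below assumes the same kind of placeholder username and host (`yourUsername`, `cluster.example.org`), neither of which comes from the lesson itself, and it only prints the command instead of connecting.

```shell
# Placeholder values (assumptions): replace with your own account details.
user="yourUsername"
host="cluster.example.org"

# -X asks the ssh client to forward X11 so remote GUI windows can open locally.
gui_login_command="ssh -X ${user}@${host}"

# Print the command rather than running it.
echo "$gui_login_command"
```

Whether forwarding actually works depends on the server configuration and on having a local X server running (for example XQuartz on macOS).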