Summary and Schedule
This workshop is an introduction to using high-performance computing systems effectively. We can’t cover every case or give an exhaustive course on parallel programming in just two days’ teaching time. Instead, this workshop is intended to give students a good introduction and overview of the tools available and how to use them effectively.
Prerequisites
Command line experience is necessary for this lesson. We recommend that participants who are new to the command line (also known as the terminal or shell) work through shell-novice first.
By the end of this workshop, students will know how to:
- Identify problems a cluster can help solve.
- Use the UNIX shell (also known as terminal or command line) to connect to a cluster.
- Transfer files onto a cluster.
- Submit and manage jobs on a cluster using a scheduler.
- Observe the benefits and limitations of parallel execution.
Getting Started
To get started, follow the directions in the “Setup” tab to download data to your computer and follow any installation instructions.
Note that this is the draft HPC Carpentry release. Comments and feedback are welcome.
For Instructors
If you are teaching this lesson in a workshop, please see the Instructor notes.
Setup Instructions | Download files required for the lesson |
Duration: 00h 00m | 1. Why use a Cluster? | Why would I be interested in High Performance Computing (HPC)? What can I expect to learn from this course? |
Duration: 00h 20m | 2. Connecting to a remote HPC system | How do I log in to a remote HPC system? |
Duration: 00h 55m | 3. Exploring Remote Resources | How does my local computer compare to the remote systems? How does the login node compare to the compute nodes? Are all compute nodes alike? |
Duration: 01h 30m | 4. EPCC version - Working on a remote HPC system | What is an HPC system? How does an HPC system work? How do I log on to a remote HPC system? |
Duration: 02h 05m | 5. Scheduler Fundamentals | What is a scheduler and why does a cluster need one? How do I launch a program to run on a compute node in the cluster? How do I capture the output of a program that is run on a node in the cluster? |
Duration: 03h 20m | 6. HPCC version - Scheduler Fundamentals | What is a scheduler and why does a cluster need one? How do I launch a program to run on a compute node in the cluster? How do I capture the output of a program that is run on a node in the cluster? |
Duration: 04h 35m | 7. EPCC version - Working with the scheduler | What is a scheduler and why are they used? How do I launch a program to run on any one node in the cluster? How do I capture the output of a program that is run on a node in the cluster? |
Duration: 05h 55m | 8. Environment Variables | How are variables set and accessed in the Unix shell? How can I use variables to change how a program runs? |
Duration: 06h 10m | 9. Accessing software via Modules | How do we load and unload software packages? |
Duration: 06h 55m | 10. Transferring files with remote computers | How do I transfer files to (and from) the cluster? |
Duration: 07h 25m | 11. Running a parallel job | How do we execute a task in parallel? What benefits arise from parallel execution? What are the limits of gains from execution in parallel? |
Duration: 08h 55m | 12. Using resources effectively | How can I review past jobs? How can I use this knowledge to create a more accurate submission script? |
Duration: 09h 25m | 13. Using shared resources responsibly | How can I be a responsible user? How can I protect my data? How can I best get large amounts of data off an HPC system? |
Duration: 09h 45m | Finish |
The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.
There are several pieces of software you will wish to install before the workshop. Though installation help will be provided at the workshop, we recommend that these tools are installed (or at least downloaded) beforehand.
Bash and SSH
This lesson requires a terminal application (bash, zsh, or others) with the ability to securely connect to a remote machine (ssh).
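One way to confirm ahead of the workshop that both requirements are met is to ask the shell whether it can find each program. The following is a minimal sketch, not part of the lesson itself: the helper name check_tool is made up for illustration, and it relies only on the standard command -v shell builtin.

```shell
# check_tool is a hypothetical helper name, not part of the lesson.
# It reports whether a program can be found on your PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

# Look for a Bash shell and an SSH client.
check_tool bash
check_tool ssh
```

If either line reports "missing", the platform-specific instructions that follow describe how to obtain the tool.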
Where to Type Commands: How to Open a New Shell
The shell is a program that enables us to send commands to the computer and receive output. It is also referred to as the terminal or command line.
Some computers include a default Unix Shell program. The steps below describe some methods for identifying and opening a Unix Shell program if you already have one installed. There are also options for identifying and downloading a Unix Shell program, a Linux/UNIX emulator, or a program to access a Unix Shell on a server.
Unix Shells on Windows
Computers with Windows operating systems do not automatically have a Unix Shell program installed. In this lesson, we encourage you to use an emulator included in Git for Windows, which gives you access to both Bash shell commands and Git. If you have attended a Software Carpentry workshop session, it is likely you have already received instructions on how to install Git for Windows.
Once installed, you can open a terminal by running the program Git Bash from the Windows start menu.
Shell Programs for Windows
- Git for Windows – Recommended
- Windows Subsystem for Linux – advanced option for Windows 10
Alternatives to Git for Windows
Other solutions are available for running Bash commands on Windows. There is now a Bash shell command-line tool available for Windows 10. Additionally, you can run Bash commands on a remote computer or server that already has a Unix Shell, from your Windows machine. This can usually be done through a Secure Shell (SSH) client. One such client available for free for Windows computers is PuTTY. See the reference below for information on installing and using PuTTY, using the Windows 10 command-line tool, or installing and using a Unix/Linux emulator.
Advanced users may choose one of the following alternatives:
- Install the Windows Subsystem for Linux
- Use the Windows PowerShell
- Read up on Using a Unix/Linux emulator (Cygwin) or Secure Shell (SSH) client (PuTTY)
Warning
Commands in the Windows Subsystem for Linux (WSL), PowerShell, or Cygwin may differ slightly from those shown in the lesson or presented in the workshop. Please ask if you encounter such a mismatch – you’re probably not alone.
Unix Shell on macOS
On macOS, the default Unix Shell is accessible by running the Terminal program from the /Applications/Utilities folder in Finder.
To open Terminal, try one or both of the following:
- In Finder, select the Go menu, then select Utilities. Locate Terminal in the Utilities folder and open it.
- Use the Mac ‘Spotlight’ computer search function. Search for: Terminal and press Return.
For an introduction, see How to Use Terminal on a Mac.
Unix Shell on Linux
On most versions of Linux, the default Unix Shell is accessible by running (GNOME) Terminal, (KDE) Konsole, or xterm, which can be found via the applications menu or the search bar.
SSH for Secure Connections
All students should have an SSH client installed. SSH is a tool that allows us to connect to and use a remote computer as our own.
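As a rough illustration of what connecting looks like, an SSH login command generally takes the form ssh user@host. The sketch below only builds and prints such a command rather than running it, because the user name and host name are placeholders invented for this example; your instructor will supply the real values.

```shell
# Placeholder values: replace both with the details your instructor provides.
user="yourUsername"
host="cluster.example.org"

# The general form of a login command. It is printed rather than run,
# because the host name above is not a real cluster.
login_cmd="ssh ${user}@${host}"
echo "${login_cmd}"
```

Typing the printed command (with real values substituted) into a terminal would start a session on the remote machine.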
SSH for Windows
Git for Windows comes with SSH pre-installed: you do not have to do anything.
GUI Support for Windows
If you know that the software you will be running on the cluster requires a graphical user interface (a GUI window needs to open for the application to run properly), please install MobaXterm Home Edition.
SSH for macOS
macOS comes with SSH pre-installed: you do not have to do anything.
GUI Support for macOS
If you know that the software you will be running requires a graphical user interface, please install XQuartz.
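Graphical programs on a remote machine are commonly displayed locally through X11 forwarding, which the SSH client requests with its -X option. The sketch below assumes the cluster permits X11 forwarding, which this page does not confirm, and uses placeholder login details. It checks whether a local X display is available and prints the corresponding login command without running it.

```shell
# Placeholder login details, not a real cluster.
user="yourUsername"
host="cluster.example.org"

# A running X server usually makes DISPLAY available in the local terminal.
if [ -n "${DISPLAY:-}" ]; then
    echo "Local X display: ${DISPLAY}"
else
    echo "DISPLAY is not set; start your X server first"
fi

# -X asks the SSH client to forward X11 windows from the remote machine.
gui_login_cmd="ssh -X ${user}@${host}"
echo "${gui_login_cmd}"
```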