Why use a Cluster?


  • High Performance Computing (HPC) typically involves connecting to very large computing systems elsewhere in the world.
  • These other systems can be used to do work that would either be impossible or much slower on smaller systems.
  • HPC resources are shared by multiple users.
  • The standard method of interacting with such systems is via a command line interface.

Connecting to a remote HPC system


  • An HPC system is a set of networked machines.
  • HPC systems typically provide login nodes and a set of worker nodes.
  • The resources found on independent (worker) nodes can vary in volume and type (amount of RAM, processor architecture, availability of network mounted filesystems, etc.).
  • Files saved to shared (network-mounted) storage from one node are available on all nodes.

Exploring Remote Resources


  • An HPC system is a set of networked machines.
  • HPC systems typically provide login nodes and a set of compute nodes.
  • The resources found on independent (compute) nodes can vary in volume and type (amount of RAM, processor architecture, availability of network mounted filesystems, etc.).
  • Files saved on shared storage are available on all nodes.
  • The login node is a shared machine: be considerate of other users.
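
A quick way to see how resources differ between nodes is to run the same few commands on each one and compare the results. The sketch below assumes a Linux node with the standard `nproc` and `df` utilities and a readable `/proc/meminfo`; the variable name `cpu_count` is only illustrative.

```shell
# Number of processing units available to this shell on the current node.
cpu_count="$(nproc)"
echo "CPUs on $(uname -n): $cpu_count"

# Total memory, read from the Linux-specific /proc/meminfo if it exists.
if [ -r /proc/meminfo ]; then
    grep MemTotal /proc/meminfo
fi

# Filesystem holding the current directory; network mounts show up here too.
df -h .
```

These commands are lightweight, so they are safe to run on a login node.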

EPCC version - Working on a remote HPC system


  • An HPC system is a set of networked machines.
  • HPC systems typically provide login nodes and a set of worker nodes.
  • The resources found on independent (worker) nodes can vary in volume and type (amount of RAM, processor architecture, availability of network mounted filesystems, etc.).
  • Files saved to shared (network-mounted) storage from one node are available on all nodes.

Scheduler Fundamentals


  • The scheduler handles how compute resources are shared between users.
  • A job is just a shell script.
  • Request slightly more resources than you will need.
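
To make "a job is just a shell script" concrete, here is a minimal sketch of a job script. It assumes the Slurm scheduler, which the points above do not name; other schedulers use different directive prefixes and options. The job name and resource values are placeholders.

```shell
#!/bin/bash
# Assumption: a Slurm scheduler. Lines starting with "#SBATCH" are ordinary
# comments to the shell, but Slurm reads them as resource requests.
#SBATCH --job-name=example-job
#SBATCH --ntasks=1
# If the work needs about 3 minutes, asking for 5 leaves a small safety margin.
#SBATCH --time=00:05:00
#SBATCH --mem=1G

# Everything below is plain shell and runs on whichever node is allocated.
node_name="$(uname -n)"
echo "This job ran on: $node_name"
```

Because the directives are comments, the same file also runs as a normal script, which is a convenient way to test it before submitting. Under Slurm it would typically be submitted with `sbatch`.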

HPCC version - Scheduler Fundamentals


  • The scheduler handles how compute resources are shared between users.
  • A job is just a shell script.
  • Request slightly more resources than you will need.

EPCC version - Working with the scheduler


  • The scheduler handles how compute resources are shared between users.
  • Everything you do should be run through the scheduler.
  • A job is just a shell script.
  • If in doubt, request more resources than you will need.

Environment Variables


  • Shell variables are treated as strings by default.
  • Variables are assigned using “=” and recalled using the variable’s name prefixed by “$”.
  • Use “export” to make a variable available to other programs.
  • The PATH variable defines the shell’s search path.
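
The four points above can be tried out in a few lines of shell. This is a small sketch; the variable names and values (`project`, `count`, and so on) are made up for illustration.

```shell
# Assignment uses "=" with no spaces around it; "$" recalls the value.
project="demo"
count=5
echo "Project: $project"

# Values are strings by default, so "+" is kept as literal text here...
as_string="$count+1"
# ...and arithmetic expansion is needed to treat the value as a number.
as_number=$((count + 1))
echo "$as_string versus $as_number"

# A plain shell variable is not visible to a child process...
child_before="$(sh -c 'echo "${project:-unset}"')"
# ...until it has been exported.
export project
child_after="$(sh -c 'echo "${project:-unset}"')"
echo "Child saw: $child_before, then: $child_after"

# PATH is a colon-separated list of directories searched for commands.
first_path_entry="${PATH%%:*}"
echo "First directory searched: $first_path_entry"
```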

Accessing software via Modules


  • Load software with module load softwareName.
  • Unload software with module unload softwareName.
  • The module system handles software versioning and package conflicts for you automatically.
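
A typical load, inspect, unload cycle might look like the sketch below. The `module` command is usually only defined on the cluster, so the script checks for it first; `softwareName` is the placeholder from the points above, not a real package, and `modules_available` is an illustrative variable.

```shell
# Placeholder name: replace with a module that exists on your system.
software="softwareName"

# "module" is usually a shell function set up by the cluster's login
# environment, so it is often missing on a laptop.
if command -v module >/dev/null 2>&1; then
    modules_available="yes"
    # Load the software, show what is currently loaded, then unload it again.
    module load "$software" && module list && module unload "$software"
else
    modules_available="no"
    echo "No module command found here; try this on the cluster."
fi
```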

Transferring files with remote computers


  • wget and curl -O download a file from the internet.
  • scp and rsync transfer files to and from your computer.
  • You can use an SFTP client like FileZilla to transfer files through a GUI.
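
The commands named above usually take the forms sketched below. The account, host name, URL, and file names are all hypothetical, so the script only prints each command by default; the `run` helper and the `DRY_RUN` variable are illustrative conveniences, not part of any of these tools.

```shell
# Hypothetical account and host: replace with your own.
remote="yourUsername@cluster.example.org"
dry_run="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print a command, and run it only when dry-run mode is switched off.
run() {
    echo "+ $*"
    if [ "$dry_run" = "0" ]; then
        "$@"
    fi
}

# Download a file from the internet into the current directory.
run curl -O https://example.org/data/input.tar.gz

# Copy a single local file to your home directory on the remote system.
run scp notes.txt "${remote}:~/"

# Copy a directory; -a preserves permissions and timestamps, -v lists files.
run rsync -av results/ "${remote}:results/"
```

Setting `DRY_RUN=0` executes the commands for real once the placeholders have been replaced.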

Running a parallel job


  • Parallel programming allows applications to take advantage of parallel hardware.
  • The queuing system facilitates executing parallel tasks.
  • Performance improvements from parallel execution do not scale linearly.
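
One common way to see why speedup is not linear is Amdahl's law, which the points above do not name: if only a fraction p of a program can run in parallel, the best possible speedup on n cores is 1 / ((1 - p) + p / n). The sketch below evaluates that formula with `awk`, assuming a program that is 90% parallel; the function name `amdahl_speedup` is illustrative.

```shell
# Ideal speedup under Amdahl's law.
# $1 = fraction of the work that can run in parallel (0 to 1)
# $2 = number of cores
amdahl_speedup() {
    awk -v p="$1" -v n="$2" 'BEGIN { printf "%.2f\n", 1 / ((1 - p) + p / n) }'
}

# Doubling the cores gives less and less extra speedup each time.
for cores in 1 2 4 8 16; do
    echo "$cores cores: $(amdahl_speedup 0.9 "$cores")x"
done
```

With these assumptions, 16 cores give a speedup of about 6.4 rather than 16, because the serial 10% of the work never gets faster.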

Using resources effectively


  • Accurate job scripts help the queuing system efficiently allocate shared resources.

Using shared resources responsibly


  • Be careful how you use the login node.
  • Your data on the system is your responsibility.
  • Plan and test large data transfers.
  • It is often best to convert many files to a single archive file before transferring.
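
Bundling many small files into one archive can be done with `tar`. The sketch below creates a few sample files in a temporary directory so that it can be run anywhere; the directory and file names are made up for illustration.

```shell
# Create a scratch directory with several small sample files.
workdir="$(mktemp -d)"
mkdir "$workdir/results"
for i in 1 2 3 4 5; do
    echo "sample $i" > "$workdir/results/output_$i.txt"
done

# -c creates an archive, -z compresses it with gzip, -f names the archive.
# -C changes directory first, so paths inside the archive stay relative.
tar -czf "$workdir/results.tar.gz" -C "$workdir" results

# List the archive contents and count the files before transferring it.
file_count="$(tar -tzf "$workdir/results.tar.gz" | grep -c '\.txt$')"
echo "Archived $file_count files into $workdir/results.tar.gz"
```

On the receiving side, `tar -xzf results.tar.gz` unpacks the archive again.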