[ONLINE] Working effectively with HPC systems @ SNIC
Start: Tuesday, 20 April 2021 @ 08:00
End: Tuesday, 20 April 2021 @ 13:00
Description:
The seminar will present useful tools and best practices for working effectively on HPC systems. It is expected to be of interest to general HPC system users at both the beginner and the intermediate level. To participate, register below.
Working efficiently on HPC systems starts with the tools you use to interact with them. It is also helpful to understand the general anatomy of HPC systems and their storage. Building on these fundamentals, we will give some recommendations for organizing data on the system and show examples of various types of file systems (e.g. parallel vs. local), with their individual strengths and weaknesses. We will then discuss the concepts of parallelism, scalability and scheduling, and what types of OS and software you can expect on HPC systems. We will go through some important things to consider when building and installing software. Finally, we will look at different ways of running software on HPC systems and of monitoring it while it runs, with the aim of ensuring that your jobs are not poorly configured or wasting resources.
While the content and the practices are useful for HPC systems in general, we will show examples and tools specific to the NSC clusters, e.g. Tetralith and Sigma.
The schedule for the day is divided into two main parts, before and after the lunch break. Each part consists of several blocks of about 30 minutes, with optional breaks. There will be opportunities for questions. Depending on time, the day might end with a longer question session. The detailed schedule can be found at the NSC event page.
10:00-12:00 Part I
12:00-13:00 Lunch
13:00-15:00 Part II
- Welcome, introductions and practicalities
- Tools at your end (e.g. terminal, ssh config., file transfer tools, VNC)
- HPC system anatomy (login and compute nodes, interconnect, storage)
- Properties and features of storage areas (e.g. quotas, performance, locality, backups, snapshots, scratch)
- Concept of parallelism (Amdahl's law), scalability, scheduling and practical advice for good performance
- Software on an HPC system (OS, modules, python envs., concept of build envs., containers with Singularity)
- Ideas and strategies for organizing your workflow (data and file management, traceability and reproducibility)
- Interacting with the Slurm queueing system (requesting resources interactively or in batch)
- Practical example (preparing, submitting, monitoring and evaluating job efficiency)
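As a taste of the parallelism topic, Amdahl's law can be sketched in a few lines of Python. This is only an illustration and not part of the course materials: the function name `amdahl_speedup` and the 95% parallel fraction are assumptions chosen for the example. For a program whose parallel fraction is p, the ideal speedup on n cores is 1 / ((1 - p) + p / n), which can never exceed 1 / (1 - p).

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Ideal speedup predicted by Amdahl's law.

    parallel_fraction: share of the runtime that can be parallelized (0..1).
    workers: number of cores (or processes) used for the parallel part.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Hypothetical job where 95% of the runtime parallelizes perfectly.
    p = 0.95
    for n in (1, 2, 8, 32, 128):
        speedup = amdahl_speedup(p, n)
        # Efficiency = speedup per core; it drops as more cores are added.
        print(f"{n:4d} cores: speedup {speedup:6.2f}, efficiency {speedup / n:6.1%}")
```

With p = 0.95 the speedup stays below 20 however many cores are requested, which is one reason to check job efficiency before asking for more resources.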
The course materials, presentations, as well as detailed schedule etc., will be made available at the corresponding NSC event page. In future, refer to NSC past events.
Weine Olovsson and Hamish Struthers will present; Peter Kjellström and Torben Rasmussen will also help out during the sessions. All are at NSC, LiU, Sweden.