The Department of Computer Science hosts a computing cluster for scientific workloads. Please also consider the HPC Services offered by our ZID as they may suit your requirements better.

Hardware

The system is located in room 3W06 of the ICT Building. It consists of 17 nodes with the following specification:

Motherboard: ASRock Fatal1ty X399 Professional Gaming
CPU: AMD Ryzen Threadripper 2950X WOF
Memory: G.Skill D4128GB 2400-15 Flare X K8
SSD: WD Black SN750 500 GB, NVMe 1.3, Read: 3470 MB/s, Write: 2600 MB/s

Nodes gc1-gc8 offer:

GPU: 4x ZOTAC GeForce RTX 2070 Blower, 8 GB (GDDR6, 256 Bit)

Nodes gc9-gc16 offer:

GPU: 4x ASUS GeForce RTX 2070 SUPER TURBO EVO, 8 GB (GDDR6, 256 Bit)

Node gc17 offers:

GPU: 1x NVIDIA TITAN RTX, 24 GB (GDDR6, 384 Bit)

The head node (the one without GPUs) provides 2 TB of additional storage capacity that is reachable from all nodes.

Network

The nodes are interconnected through a Gigabit Ethernet switch. The upstream connection of the head node is Gigabit Ethernet. The nodes have 10 Gigabit connections for MPI.

System Configuration

  • All nodes run the latest version of Ubuntu Server LTS, which is 18.04.2 (Bionic Beaver).
  • All nodes have the same home directories mounted.
  • SLURM is used as job scheduler.
  • ifi-auth is used as the authentication backend.

Storage

Your home directory /home/name.surname should not exceed 100 GB. Permissions should be 700 unless you intend to share it:

$ chmod 700 /home/name.surname

18 TB of HDD space is available to the GPU cluster as scratch space. It uses the 1G network.

The mount point is /scratch on each node.

For better usage, please create a directory for yourself using the same username you have for the cluster and make it accessible only to you.

(If you need to share a directory, set the group/other permissions to 750 or 755.)

$ cd /scratch
$ mkdir name.surname
$ chmod -R 700 /scratch/name.surname
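
If you later need to share the directory with your group, one possible variation (assuming read and execute access for the group is sufficient) is:

$ chmod -R 750 /scratch/name.surname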

The IFI-NAS storage is mounted at /ifi-NAS/[your_group]. Information about the IFI-NAS is available here.

Installed Software Packages

Ubuntu Distribution Packages:

  • Python 2.7
  • Python 3.6
  • gcc 7.4
  • GNU Make 4.1

Loadable Modules:

  • Open MPI 4.0.0 (/software-shared)
  • mpich 3.3.1 (/software-shared)

Modules

Environment Modules can be used to modify a user's environment. Use this to dynamically load modules in your sbatch script.

basic commands:

module avail -- list available modules
module show [module name] -- show details about one module
module help [module name] -- show help from one module

example usage in an sbatch script:

module load openmpi
module unload openmpi 

Authorization

To request access to the ifi-cluster, users have to subscribe to the mailing list.

Usage

After registration you can log in to the system with your IFI credentials using ssh on ifi-cluster.uibk.ac.at.
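
For example (assuming your username follows the name.surname scheme used for the home directories):

$ ssh name.surname@ifi-cluster.uibk.ac.at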

Show running Jobs

squeue
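
To show only your own jobs, squeue can be restricted to a user (name.surname is a placeholder for your username):

$ squeue -u name.surname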

Submit Job

Batch Job

Write a batch script similar to this basic example:

cluster-test.sh
#!/bin/bash -l
#SBATCH --partition=IFIgpu
#SBATCH --job-name=firstDemo
#SBATCH --mail-type=BEGIN,END,FAIL ##optional
#SBATCH --mail-user=max.mustermann@uibk.ac.at ##optional
#SBATCH --account=your_group ##change to your group
#SBATCH --uid=your_username  ##change to your username
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=1
#SBATCH --mem=4G
#SBATCH --time=0-00:30:00
#SBATCH --output slurm.%N.%j.out # STDOUT
#SBATCH --error slurm.%N.%j.err # STDERR
srun /bin/hostname | /usr/bin/sort
The options used in this example:

  • --partition= specifies on which partition your job should run. Available partitions: IFIall for CPU computation, IFIgpu for GPU computation, IFItitan for large GPU computation on the NVIDIA TITAN RTX card.
  • --job-name= defines a name for your job (optional).
  • --mail-type= lets you receive emails about job events. Valid type values are NONE, BEGIN, END, FAIL, REQUEUE, ALL and others, see the official documentation (optional).
  • --mail-user= specifies the user's email address; it can be any UIBK email. If --mail-type is set to something other than NONE and --mail-user is missing, emails are sent to the local user on the IFI cluster (check with the "mail" command on the head node).
  • --account= specifies the group you belong to.
  • --uid= specifies your username.
  • --nodes= specifies how many nodes your job should run on; the scheduler will allocate this number of nodes to your job.
  • --ntasks-per-node= requests that this many tasks be invoked on each node.
  • --mem= specifies the real memory required per node. The default unit is megabytes; other units can be specified with the suffixes [K|M|G|T].
  • --time= sets a limit on the total run time of the job allocation. If the requested time limit exceeds the partition's time limit, the job will be left in a PENDING state (possibly indefinitely).
  • --output=<filename pattern> directs the batch script's standard output to the file named by the filename pattern. By default both standard output and standard error are directed to the same file.
  • --error=<filename pattern> directs the batch script's standard error to the file named by the filename pattern.

The final line, srun /bin/hostname | /usr/bin/sort, is the command that you actually want to run.

Check the syntax with sbatch --test-only <filename> and submit the job with sbatch <filename>.
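
For example, with the script above saved as cluster-test.sh:

$ sbatch --test-only cluster-test.sh   # validate the script without submitting it
$ sbatch cluster-test.sh               # submit the job
$ squeue                               # verify that the job shows up in the queue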

Other valid and useful options:

  • --exclusive[=user] means the job allocation cannot share nodes with other running jobs (or, with the "=user" option, only not with jobs of other users).
  • --gres=<list> specifies a comma-delimited list of generic consumable resources (GPUs). The format of each entry in the list is "name:type:count" (e.g. --gres=gpu:4).
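
A sketch of how --gres could be combined with the options above to request GPUs; the GPU count of 2 and the nvidia-smi call are only illustrative, not taken from this documentation:

#!/bin/bash -l
#SBATCH --partition=IFIgpu
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --gres=gpu:2         # request two of the node's GPUs
#SBATCH --time=0-00:10:00
srun nvidia-smi              # placeholder workload: list the GPUs allocated to the job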

There is also an example script for C compilation.
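
A minimal sketch of what such a compile-and-run MPI job could look like; the source file hello.c, the output name and the mpicc wrapper provided by the Open MPI module are assumptions:

#!/bin/bash -l
#SBATCH --partition=IFIall
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=1
#SBATCH --time=0-00:10:00
module load openmpi          # provides mpicc and the MPI runtime
mpicc -O2 -o hello hello.c   # compile the (hypothetical) source file
srun ./hello                 # run one task per allocated node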

Consult the official documentation for details: https://slurm.schedmd.com/sbatch.html

Interactive Job

srun /bin/hostname

Interactive shell session

srun --nodes=1 --nodelist=gc3 --ntasks-per-node=1 --time=01:00:00 --pty bash -i

will get you a shell on node gc3 for one hour.

Cancel Job

scancel <jobid>
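
For example, look up the job ID in the squeue output and cancel that job (the ID below is a placeholder):

$ squeue -u name.surname     # the JOBID column shows the ID of each of your jobs
$ scancel 12345              # cancel the job with that ID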

The basic commands can be found in the SLURM User Guide

Live Monitoring Tool

The live monitoring tool for the cluster can be found here.

Contact

Contact ifi-sysadmin@informatik.uibk.ac.at if you have problems or further questions. For administration purposes there is also the internal documentation.

Internal Chat

The IFI internal chat room for the GPU cluster can be accessed via Matrix Chat.
