Need help running your job on the cluster? Need a question answered about a particular tool or cluster resources?
However, please read this documentation first. If you find something missing from the documentation, let us know so we can update and improve it.
Whenever you have a question, please try to help us help you by providing the following information:
→ Try these two commands:
showq
squeue
If you see several of your jobs using showq, but none with squeue, or if you submitted your jobs using qsub instead of sbatch/salloc, you are trying to use the old PBS/Torque/Maui system, which has been gradually phased out over the past year (2021/2022). Since June 2022, the last useful resources have been migrated to Slurm, so it is now definitely time to migrate your jobs. It is usually just a matter of replacing your #PBS directives with their #SBATCH counterparts.
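For illustration, a minimal sketch of such a translation (the job name, resource counts and time limit are placeholder values; adjust them to your actual job):

# Old PBS/Torque header, submitted with qsub (obsolete):
#PBS -N myjob
#PBS -l nodes=1:ppn=4
#PBS -l walltime=02:00:00

# Equivalent Slurm header, submitted with sbatch:
#SBATCH --job-name=myjob
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=4
#SBATCH --time=02:00:00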
The same goes for the Open OnDemand portal: weblogin.cluster.uni-hannover.de points to the obsolete batch system, whereas login.cluster.uni-hannover.de directs you to the current scheduler, Slurm.
→ First check whether you get any messages at all or whether the system appears totally silent. If it stays silent, you may be “outside” the LUH network; in that case, you first need to connect via VPN.
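If you are unsure, one quick way to test this is to run ssh in verbose mode (yourusername is a placeholder for your actual cluster username):

ssh -v yourusername@login.cluster.uni-hannover.de

If the connection attempt hangs or times out without any response from the server, you are most likely outside the LUH network and need to establish the VPN connection first.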
→ If you can connect to the cluster from a shell (command line) via ssh, but are denied graphical login via X2Go, chances are high that you are over your quota (maximum disk space allocation) in your HOME directory and/or your grace period has expired. Use the command checkquota on the command line to verify this. Delete or move files from within the ssh shell to get below your quota limits in HOME, then try again.
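A minimal sketch of such a cleanup session (large_result.dat is a placeholder for one of your own files):

# Log in via ssh -- this still works when the graphical X2Go login is denied:
ssh yourusername@login.cluster.uni-hannover.de

# Check your current usage against your quota limits:
checkquota

# Get below the HOME quota, e.g. by moving large files to $BIGWORK:
mv ~/large_result.dat $BIGWORK/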
→ You are probably trying to run something on the login nodes, which are unsuitable for computations or anything larger. Therefore, there is a limit of 1800 CPU seconds, after which processes are terminated. 1800 CPU seconds is enough for months of shell access, but gets used up quickly once you try to calculate something on multiple CPU cores. Read the docs on how the cluster is intended to be used. As a first step, check out our OOD portal at https://login.cluster.uni-hannover.de and submit an interactive job (a minimal example follows below). It is never a good idea to run anything more than small tests, editing, or a (small) compilation on a login node, since login nodes are shared between all users (simply run the command w at the shell prompt to see how many others are working on the same machine).
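From the shell, an interactive job can be requested with standard Slurm commands. A minimal sketch (core count, memory and time limit are example values; your jobs may need additional options such as a partition):

# Allocate one node with 4 cores and 4 GB of memory for one hour:
salloc --nodes=1 --ntasks=4 --mem=4G --time=01:00:00

# Or open an interactive shell directly on a compute node:
srun --nodes=1 --ntasks=4 --mem=4G --time=01:00:00 --pty $SHELL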
→ Use the command du -mad1 | sort -n in the file system where checkquota reports the problem. You can quickly change to the respective file system using the environment variables we automatically set for you: cd moves to your $HOME, and cd $BIGWORK and cd $PROJECT take you to the other two major file systems.
Explanation: du = “check disk usage”; -m = “show megabyte units” (to enable numerical sorting afterwards); -a = “show all files, including those starting with a dot” (those are otherwise “hidden”, and some applications, e.g. Ansys, use dot-directories for things like temporary files); -d1 = “only show entries up to a depth of one (this level)” (otherwise the command would descend down the whole directory tree); | = pass the output of du to sort, which sorts what it gets numerically (-n).
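A typical session might look like this (assuming, for the sake of the example, that checkquota reported the problem in BIGWORK):

# Change to the affected file system:
cd $BIGWORK

# List everything at this level in megabytes, smallest first:
du -mad1 | sort -n

# The biggest entries end up at the bottom; to see only the ten largest, append tail:
du -mad1 | sort -n | tail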
→ You are probably using X2Go and did not check the “Full Installation” checkbox during installation. Please reinstall X2Go and make sure all fonts are installed.
→ Sending lots of very similar mail to external freemail providers leads to the uni-hannover.de domain being blocked by those sites, which in turn means that almost no external recipient receives mail from LUH accounts or machines any more. Therefore, we ask you to use your institute's email account (the one ending in @xxxx.uni-hannover.de), and we also ask you NOT to set up an automatic forward to a Google, GMX or similar external freemailer account.