keller_and_evans_lab:cu_research_computing

  
This document will mostly cover specific instructions for using RC in the Vrieze and Keller labs. We will try to update this, but RC is a bit of a moving target, so some of what is written below may now be outdated.
  
  
======= Getting started =======

General documentation for using RC is [[https://curc.readthedocs.io/en/latest/|on their website]]. We recommend that ALL new users first read these overviews on that webpage. In particular:
  * Logging In <which you've already done>
  * Duo 2-factor Authentication <which you've already done>
  * Allocations
  * Node Types
  * Filesystems
  * The modules system
  * The PetaLibrary
  * Running applications with Jobs
  * Batch Jobs and Job Scripting
  * Interactive Jobs
  * Useful Slurm commands
  * Job Resource Information
  * squeue status and reason codes
  * Containerization on Summit


======= Overview of best practices =======

This was written by Richard Border on Oct 8, 2019:
{{file_example.jpg}}


======= Logging in =======
  
Put these settings in your ''~/.ssh/config'' file so you only have to enter your OTP once per session, instead of for every ssh connection you make.
  
These settings should work from Mac and Linux. I'm not sure how to do the equivalent from Windows with PuTTY. On a Mac, those settings will cause X11 to start. If you don't want that to happen, then remove the ''ForwardX11 yes'' line.
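For reference, this kind of connection reuse is normally done with OpenSSH's ''ControlMaster'' options. Below is a minimal sketch of such a ''~/.ssh/config'' entry; the host name, username, and socket path are placeholder assumptions, not necessarily the settings this page originally listed:

```
# Placeholder host pattern and username -- substitute the real login host and your RC username
Host login.rc.colorado.edu
    User your_rc_username
    # Remove this line if you don't want X11 to start
    ForwardX11 yes
    # Reuse a single authenticated connection, so you enter your OTP once per session
    ControlMaster auto
    ControlPath ~/.ssh/sockets/%r@%h-%p
    ControlPersist 8h
```

Note that the ''~/.ssh/sockets'' directory must exist before the first connection (''mkdir -p ~/.ssh/sockets'').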

For those with access to Summit (ONLY!), here are the steps for using it:

  #From a login node:
  ssh -YC <uname>@shas07{01-15}
  
  #In your shell script, there is no need to include -A UCB00000442, but do include:
  #SBATCH --partition=shas
  
  #To run R:
  ml load R
  ml load gcc
  R
  
  
  
======= Slurm =======


====== Queues ======

  #If you want to run on the IBG himem nodes, you need to load the right module
  module load slurm/blanca
  
  #then in your shell script:
  #SBATCH --qos=blanca-ibg
  
  #If you want to run on the normal queues, then:
  module load slurm/slurm
  
  #then in your shell script, one of the below, depending on which queue you want:
  #SBATCH --qos=himem
  #SBATCH --qos=crestone
  #SBATCH --qos=janus
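As a concrete sketch of the Blanca workflow above (the script name and its contents are placeholders):

```shell
# From a login node: point your session at the Blanca scheduler, then submit
module load slurm/blanca
sbatch myjob.sh   # placeholder script name

# where myjob.sh contains, among its other #SBATCH lines:
#   #SBATCH --qos=blanca-ibg
```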
  
  
  
  
  #To check our balance on our allocations and get the account id#
  sbank balance statement
  sacctmgr -p show user <username>  #alternative way to find the acct#
  
  #To see how busy the nodes are. For seeing how many janus nodes are available, look for the
  #number under NODES where STATE is "idle" for PARTITION "janus" and TIMELIMIT 1-00:00:00.
  sinfo -l
  
  #Checking on submissions for a user
  squeue -u <username>  #To see your job statuses (R is for running, PD pending, CG completing, CD completed, F failed, TO timeout)
  squeue -u <username> -t RUNNING
  squeue -u <username> -t PENDING
  squeue -u <username> --start  #Get an estimate of when jobs will start
  
  #Detailed information on a queue (who is running on it, how many cpus requested, memory requested, time information, etc.)
  squeue -q blanca-ibg -o %u,%c,%e,%m,%j,%l,%L,%o,%R,%t | column -ts ','
  
  #Current status of queues
  qstat -i  #To see jobs that are currently pending (helpful for seeing if a queue is overbooked)
  qstat -r  #To see jobs that are currently running
  qstat -a  #To see jobs that are running OR are queued
  qstat -a -n  #To see all jobs, including which nodes they are running on
  qstat -r -n  #To see running jobs, and which nodes they are running on
  
  #Other commands
  showq-slurm -o -U -q <partition>  #List job priority order for current user (you) in given partition
  scontrol show jobid -dd <jobid>   #List detailed information for a job (useful for troubleshooting). More info [[https://www.rc.colorado.edu/book/export/html/613|here]].
  pbsnodes -a  #To look at the status of each node
  
  ### Once job has completed, you can get additional information
  sacct -u <username> --format=JobID,JobName,MaxRSS,Elapsed  #View same info for all jobs of user
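As an illustration of putting that ''sacct'' output to use, the sketch below filters jobs by memory footprint. The job rows are made-up sample data, and it assumes the pipe-delimited output of ''sacct -P'' with MaxRSS reported with a ''K'' (kilobyte) suffix; real output can use other suffixes:

```shell
# Made-up sample of `sacct -u <username> --format=JobID,JobName,MaxRSS,Elapsed -P` output
sample='JobID|JobName|MaxRSS|Elapsed
101|gwas_chr1|12000K|01:02:03
102|gwas_chr2|8000K|00:40:00'

# Print the JobID and MaxRSS of jobs that used more than 10000K of memory
echo "$sample" | awk -F'|' 'NR > 1 { rss = $3; sub(/K$/, "", rss); if (rss + 0 > 10000) print $1, $3 }'
```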
  
  #To check graphically how much storage is being taken up in the /work/KellerLab folder
  xdiskusage /work/KellerLab/sizes
  
  
  
  
====== Running and Controlling jobs ======


  sbatch <shell.script.name.sh>      #Run a shell script
  sinteractive --nodelist=bnode0102  #Run an interactive job on node "bnode0102"
  scancel <jobid>                    #Cancel one job
  scancel -u <username>              #Cancel all jobs for user
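To cancel only a subset of your jobs (e.g. everything still pending), the usual pattern is to pipe bare job IDs from ''squeue'' into ''scancel'' via ''xargs''. In the runnable sketch below, ''printf'' stands in for the ''squeue'' call and ''echo scancel'' stands in for ''scancel'', so it can be tried safely off the cluster:

```shell
# Real usage (sketch): squeue -u <username> -t PENDING -h -o %i | xargs -n1 scancel
# (-h suppresses the header; -o %i prints only job IDs)
printf '101\n102\n' | xargs -n1 echo scancel
```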
This only needs to be done once.
  
Then launch your interactive job on the IBG himem node.
  
  module load slurm/blanca && sinteractive --qos=blanca-ibg

Or launch it on any free himem node:

  module load slurm/blanca && sinteractive --qos=blanca
  
  
keller_and_evans_lab/cu_research_computing.txt · Last modified: 2020/02/12 09:03 by lessem