======= CU Research Computing =======

  
This document will mostly cover specific instructions for using RC in the Vrieze and Keller labs. We will try to update this, but RC is a bit of a moving target, so some of what is written below may now be outdated.
  
  
======= Getting started =======

General documentation for using RC is [[https://curc.readthedocs.io/en/latest/|on their website]]. We recommend that ALL new users first read these overviews. In particular:
  * Logging In (which you've already done)
  * Duo 2-factor Authentication (which you've already done)
  * Allocations
  * Node Types
  * Filesystems
  * The modules system
  * The PetaLibrary
  * Running applications with Jobs
  * Batch Jobs and Job Scripting
  * Interactive Jobs
  * Useful Slurm commands
  * Job Resource Information
  * squeue status and reason codes
  * Containerization on Summit


======= Overview of best practices =======

This was written by Richard Border on Oct 8, 2019:
{{file_example.jpg}}


======= Logging in =======
  
Put these settings in your ''~/.ssh/config'' file so you only have to enter your OTP once per session, instead of for every ssh connection you make.
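As a hedged sketch only (the host name, username, and socket path below are placeholders; use whatever block the lab actually recommends), an SSH connection-sharing setup of this kind typically looks like:

  #Placeholders: substitute your own username, and create the ~/.ssh/sockets directory first.
  Host login.rc.colorado.edu
      User your_rc_username
      #Reuse one authenticated connection for subsequent sessions:
      ControlMaster auto
      ControlPath ~/.ssh/sockets/%r@%h-%p
      #Keep the master connection alive for 4 hours (example value):
      ControlPersist 4h
      #Remove this line if you don't want X11 forwarding:
      ForwardX11 yes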
  
These settings should work from Mac and Linux. I'm not sure how to do the equivalent from Windows with Putty. On a Mac, those settings will cause X11 to start. If you don't want that to happen, then remove the ''ForwardX11 yes'' line.

For those with access to Summit (ONLY!), here are the steps to using it:

  #From a login node:
  ssh -YC <uname>@shas07{01-15}
  
  #In your shell script, there is no need to include -A UCB00000442; just specify:
  #SBATCH --partition=shas
  
  #To run R:
  ml load R
  ml load gcc
  R
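As a hedged illustration of how those pieces fit together (the job name, walltime, task count, and R script name are made up for this example), a minimal Summit batch script might look like:

  #!/bin/bash
  #SBATCH --partition=shas
  #SBATCH --job-name=example_job
  #SBATCH --time=01:00:00
  #SBATCH --ntasks=1
  #(job name, walltime, and task count above are example values)
  
  ml load R
  ml load gcc
  Rscript my_analysis.R   #my_analysis.R is a placeholder script

Submit it with ''sbatch'' in the usual way.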
  
  
  
  
You will have to manually create a directory to put your files in. (You can also just make a big mess with files all over and annoy other users.) lustre and rc_scratch are network filesystems, and will appear identical on all nodes that connect to them. /local/scratch is local to the particular node used, so something saved on bnode0108 will not be visible from himem04. The size of /local/scratch depends on which node is used, but it is not large. The "df" command does not work on rc_scratch, so it's unclear how much space is available. In the future lustre will be going away, and will probably be replaced by rc_scratch.
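As a small illustration (the /rc_scratch mount point and directory name are assumptions; check what is actually mounted on your node), you could set up a personal working directory like this:

  #make a personal directory on rc_scratch instead of scattering files at the top level
  mkdir -p /rc_scratch/$USER/my_project   #my_project is a placeholder name
  cd /rc_scratch/$USER/my_project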
  
  
======= Slurm =======



====== Queues ======


  #if you want to run on ibg himem, you need to load the right module
  module load slurm/blanca
  
  #then in your shell script
  #SBATCH --qos=blanca-ibg
  
  #If you want to run on normal queues, then:
  module load slurm/slurm
  
  #then in your shell script, use one of the below, depending on what queue you want
  #SBATCH --qos=himem
  #SBATCH --qos=crestone
  #SBATCH --qos=janus

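To connect the queue settings above to an actual submission, here is an illustrative sequence; the script name is a placeholder, and it is assumed to contain the ''#SBATCH --qos=blanca-ibg'' line shown above:

  module load slurm/blanca     #point the Slurm commands at Blanca
  sbatch my_blanca_job.sh      #placeholder script containing "#SBATCH --qos=blanca-ibg"
  squeue -u $USER              #check that the job is queued or running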
  
  
  
  
  #To check our balance on our allocations and get the account id#
  sbank balance statement
  sacctmgr -p show user <username> #alternatively, to find the acct#
  
  #To see how busy the nodes are. For seeing how many janus nodes are available, look for the
  #number under NODES where STATE is "idle" for PARTITION "janus" and TIMELIMIT 1-00:00:00.
  sinfo -l
  
  #checking on submissions for a user
  squeue -u <username>  #To see your job statuses (R is for running, PD pending, CG completing, CD completed, F failed, TO timeout)
  squeue -u <username> -t RUNNING
  squeue -u <username> -t PENDING
  squeue -u <username> --start #Get an estimate of when jobs will start
  
  #detailed information on a queue (who is running on it, how many cpus requested, memory requested, time information, etc.)
  squeue -q blanca-ibg -o %u,%c,%e,%m,%j,%l,%L,%o,%R,%t | column -ts ','
  
  #current status of queues
  qstat -i #To see jobs that are currently pending (this is helpful for seeing if the queue is overbooked)
  qstat -r #To see jobs that are currently running
  qstat -a #To see jobs that are running OR are queued
  qstat -a -n #To see all jobs, including which nodes they are running on
  qstat -r -n #To see running jobs, and which nodes they are running on
  
  #other commands
  showq-slurm -o -U -q <partition>  #List job priority order for current user (you) in given partition
  scontrol show jobid -dd <jobid>   #List detailed information for a job (useful for troubleshooting). More info [[https://www.rc.colorado.edu/book/export/html/613|here]].
  pbsnodes -a #To look at the status of each node
  
  ### Once job has completed, you can get additional information
  sacct -j <jobid> --format=JobID,JobName,MaxRSS,Elapsed     #Stats on completed jobs by jobID
  sacct -u <username> --format=JobID,JobName,MaxRSS,Elapsed  #View same info for all jobs of user
  
  #To check graphically how much storage is being taken up in the /work/KellerLab folder
  xdiskusage /work/KellerLab/sizes
  
  
  
====== Running and Controlling jobs ======
  
  
  sbatch <shell.script.name.sh>     #run shell script
  sinteractive --nodelist=bnode0102 #run interactive job on node "bnode0102"
  scancel <jobid>                   #Cancel one job
  scancel -u <username>             #Cancel all jobs for user
This only needs to be done once.
  
Then launch your interactive job on the IBG himem node:
  
  module load slurm/blanca && sinteractive --qos=blanca-ibg


Or onto any free himem node:

  module load slurm/blanca && sinteractive --qos=blanca
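If the sinteractive wrapper on the cluster accepts the usual Slurm resource flags (it is typically a thin wrapper around the scheduler, but this is an assumption worth verifying), you can also request a specific walltime and core count; the values here are purely illustrative:

  module load slurm/blanca && sinteractive --qos=blanca-ibg --time=02:00:00 --ntasks=4   #2 hours, 4 cores (example values)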
  
  
  
  tabix -h chr${chr}/chr${chr}impv1.vcf.gz ${chr}:${startpos}-${endpos} | bgzip -c > chr$chr/chr${chr}impv1.${chr}_$startpos-$endpos.vcf.gz
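The ${chr}, ${startpos}, and ${endpos} pieces are ordinary shell variables; a self-contained version with made-up region coordinates (illustrative values only) would be:

  #illustrative region: set these to the chromosome and positions you actually need
  chr=22
  startpos=16000000
  endpos=17000000
  tabix -h chr${chr}/chr${chr}impv1.vcf.gz ${chr}:${startpos}-${endpos} | bgzip -c > chr${chr}/chr${chr}impv1.${chr}_${startpos}-${endpos}.vcf.gz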



======= Compiling software =======

RC intentionally keeps some header files off the login nodes to dissuade people from trying to compile on those nodes. Instead, use the janus-compile nodes to compile your software. Log in to a login node and then run

  ssh janus-compile[1-4]
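As a hedged example of what a compile session might look like (the package name, source directory, and install prefix are placeholders, and the modules you need depend on the software being built):

  ssh janus-compile1                                 #any of janus-compile1 through janus-compile4
  module load gcc                                    #load a compiler (adjust to what the package needs)
  cd ~/src/some_package                              #placeholder source directory
  ./configure --prefix=$HOME/software/some_package
  make && make install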
  
  