Introduction to Pronghorn

Pronghorn is the University of Nevada, Reno’s High-Performance Computing (HPC) cluster. The GPU-accelerated system is designed, built, and maintained by the Office of Information Technology’s HPC Team. Pronghorn and the HPC Team support general research across the Nevada System of Higher Education (NSHE).

Pronghorn is composed of CPU, GPU, Visualization, and Storage subsystems interconnected by a 100Gb/s non-blocking Intel Omni-Path fabric. The CPU partition features 108 nodes, 3,456 CPU cores, and 24.8TiB of memory. The GPU partition features 44 NVIDIA Tesla P100 GPUs, 352 CPU cores, and 2.75TiB of memory. The Visualization partition is composed of three NVIDIA Tesla V100 GPUs, 48 CPU cores, and 1.1TiB of memory. The storage system uses the IBM SpectrumScale file system to provide 2PiB of high-performance storage. The computational and storage capabilities of Pronghorn will regularly expand to meet NSHE computing demands.

Pronghorn is co-located at the Switch Citadel Campus, located 25 miles east of the University of Nevada, Reno. Switch is a leader in sustainable data center design and operation. The Switch Citadel is rated Tier 5 Platinum and is planned to become one of the largest, most advanced data center campuses in the world.

Pronghorn is available to all University of Nevada, Reno faculty, staff, students, and sponsored affiliates. Priority access to the system is available for purchase.

First up, let’s talk about what a high-performance computer (HPC) is: really, it is a bunch of individual computers (“nodes”), just like the ones you are using, strung together with networking cables, with the ability to easily deploy “jobs” (some computational task you are trying to accomplish) across multiple nodes. As such, we can determine how many cores we have access to by counting the number of cores on each individual node and summing them all up. Pronghorn has 3,456 CPU cores that (in theory) we have access to! In a perfect world (more on that later), you COULD divide the amount of time it takes to do a job by the number of cores you throw at it. With Pronghorn, you could theoretically do 10 YEARS of sequential calculations in about one day! Put another way, Pronghorn’s capabilities are 864 times faster than my Windows machine.

Your desktop or laptop is all yours, generally, so you aren’t sharing the resources with anyone else. You’ve effectively pre-paid for ~5 years of computational time (warranty!) times the number of cores you have, so I’ve bought about 20 years of CPU-time on my Windows desktop and 40 years of CPU-time on my Mac laptop. Pronghorn, assuming a 5-year lifespan, has 17,280 years (!) of CPU-time, all of which was purchased in advance. While you are probably OK with your laptop/desktop just sitting there idle not doing much, a research computer like Pronghorn is designed to be used at near-capacity! Also, this is a SHARED MACHINE, and as such much of the process of getting your programs to run on it requires some understanding of how the system shares its resources amongst all the users! Enter SLURM.
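Those back-of-the-envelope numbers are simple arithmetic; here is a quick sketch in bash (the 4-core desktop is an assumption inferred from the 864x figure above):

```shell
#!/bin/bash
cores=3456         # Pronghorn's CPU core count
desktop_cores=4    # assumed desktop core count (3456 / 864 = 4)
lifespan_years=5   # assumed hardware lifespan

# Theoretical speedup over the desktop
echo $((cores / desktop_cores))      # 864

# Pre-paid CPU-time over the cluster's lifespan, in core-years
echo $((cores * lifespan_years))     # 17280
```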

Introduction to SLURM

SLURM is what is known as a workload manager. SLURM’s job is to take the vast number of different jobs sent to it by all users in the system, reserve “resources” (# of nodes per job, # of cores per node, memory per job), and then execute the jobs based on the user or association’s priority.

A “job” is basically the top level of what you are trying to accomplish – a workflow, set of commands/programs to run, etc. Typically we define a single job at a time and submit it to the SLURM system. Within the job are “steps” which can be running sequentially or in parallel depending on the particulars of your workflow. A step consists of one or more “tasks”. Each “task” runs on one or more “cpus” (cpu is the same as a logical core in SLURM parlance). Parallelization can occur at multiple levels: job, step, and task.
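To make that hierarchy concrete, here is an illustrative sketch of a job script that runs two steps (partition/account details are omitted here; each srun launches one step of the job):

```bash
#!/bin/bash
#SBATCH --ntasks=4        # this job may run up to 4 tasks at once

# Step 0: a single task
srun --ntasks=1 hostname

# Step 1: four tasks in parallel (each task is one copy of the command)
srun --ntasks=4 hostname
```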

SLURM uses a “job script” written in any interpreted language that uses “#” as the comment character; typically we’ll use the “bash” language to create a job. This job script follows a very specific format that you will get familiar with. Your job script 1) tells SLURM what resources you need, and 2) once the resources are allocated, what programs to execute and how to allocate the resources to those programs.

As a general rule, Pronghorn is a BATCH system, which means you will focus on jobs that do not require user interaction and will often be deferred (run at some time in the future). While you CAN run “interactive jobs” on Pronghorn, this should be minimized wherever possible; interactive jobs typically leave resources sitting idle.

Last session, we covered connecting to Pronghorn and running interactive commands on the command line. These were key functionality commands such as: moving/copying/removing files, making directories, downloading data from the internet, searching through a file, etc. Our first tutorial today will cover these utilities using real sequencing data as an example.

Additionally, today we will cover using Pronghorn/SLURM to run our analyses. A lot of the software we run for genetics/genomics analysis runs for many hours or days. In order to run these resource-intensive programs, we will need to request resources from Pronghorn using the SLURM scheduler. To do this, we must write “scripts” (a text document with the commands we want to run) to submit our analyses to the scheduler, which then get run when time/resources are available.

Connecting to Pronghorn

In order to connect to Pronghorn, we will use ssh to connect to the remote server. Below is how you would connect from a Linux or macOS computer using a terminal program. Both of these operating systems have a terminal installed by default.


ssh yournetidhere@pronghorn.rc.unr.edu 

If this is your first time connecting to the server, you may need to accept the remote server’s host key by typing “yes”.


The authenticity of host 'pronghorn.rc.unr.edu (134.197.76.4)' can't be established.
ED25519 key fingerprint is SHA256:QAFX5eUaSvFi3/+IRuP6Zm8RM6OcGRZb5vySBgq/yZ4.
This key is not known by any other names
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes

You will then be prompted to type in your netid password. As you type, the cursor will not move/display text in order to keep your password secure.

An alternative terminal that I like on OSX is iTerm https://iterm2.com/

Recent versions of Windows 10 and 11 include an OpenSSH client you can run from PowerShell, but older versions of Windows do not have ssh built in. If yours does not, you will need to set up a program in order to remotely connect to Pronghorn. Visit one of the websites below and download the appropriate installation file for your computer.

WSL (tutorial to enable WSL) https://www.omgubuntu.co.uk/how-to-install-wsl2-on-windows-10

PUTTY (installation) https://www.chiark.greenend.org.uk/~sgtatham/putty/latest.html

Once installed, you will configure the connection information similar to above.

Using screen command

It will become more apparent as we progress today, but using the screen command is very useful. This section will show you how to configure and use screen.

When you connect to Pronghorn via ssh/PUTTY, you are presented with one window to do all your work. However, there are times when you may be editing a file and want to look up a file path/location to use in the script you are writing. To enable this, you can use the screen command. Screen creates virtual windows on the remote system so you can perform multiple actions at the same time.

First, we need to download a screen configuration file to our home directory, called “.screenrc”. I have one hosted on our core’s GitHub page here:

https://raw.githubusercontent.com/Nevada-Bioinformatics-Center/unix_configurations/main/.screenrc

This configuration file sets up how the windows will be displayed when you use screen.

Use wget to download this file to your home directory


(base) hvasquezgross@jimkent:~$ wget https://raw.githubusercontent.com/Nevada-Bioinformatics-Center/unix_configurations/main/.screenrc

Now that we have the configuration file, let’s start a session:


(base) hvasquezgross@jimkent:~/training$ screen

Now that you are running screen, you will see the bottom portion of your terminal has some new text.

In green towards the bottom left of your screen, you have the hostname of the computer you are on, [login-1] in my case.

Next to this you have your virtual windows, numbered from 0 to N. By default, right after you run screen, you will have one named window, shown as (0*$bash).

Let’s create a new screen window by pressing the CTRL+A keys together. These two keys together tell screen that the next key pressed will be a screen command.

Press the “c” key after you press the CTRL+A keys. This will create a new window, labeled (1*$bash).

You can then switch between these virtual windows by pressing the CTRL+A keys together then typing the number of the window you want to view.

Test this now.

When you are finished working, you can close your terminal to end your session. Then, the next day you want to work on the project, you can SSH to Pronghorn and issue the screen -x command to reattach to your previous screen session (screen -r works if the session was detached). Pretty nifty!

Screen cheat sheet (the standard bindings; your .screenrc may add more):

Ctrl+A c : create a new window
Ctrl+A n / Ctrl+A p : switch to the next / previous window
Ctrl+A 0-9 : jump to a window by number
Ctrl+A d : detach from the session (it keeps running)
screen -ls : list your sessions
screen -r : reattach to a detached session
screen -x : attach to a session that is still attached elsewhere

Your first SLURM Job

We are going to try to keep each batch program neatly organized and running in its own folder. So let’s go ahead and make a workshop folder:


mkdir ~/pronghorn_workshop/

Now let’s set up this module’s folder.


cd ~/pronghorn_workshop/
mkdir example1
cd example1

Let’s first create a bash script that will run our program. All this is going to do is max out one CPU for 15 seconds (so you can watch it in htop), and then print out the hostname of the computer the program runs on.


nano example1.sh

And add in this text (don’t type the ### lines; they just mark where the file contents begin and end). Copy and paste may be a little weird in an ssh window. You will likely need to click in the window and right-click to paste.


###

#!/bin/bash

# Spike a cpu for 15 seconds:
timeout 15s cat /dev/zero > /dev/null

# Print out the hostname
hostname

# Safely exit:
exit 0

###

Your nano clipboard (Ctrl+U) is not the system clipboard (right click and paste). Use Ctrl+X to exit nano, then press Y to save changes.

We are going to first test this on the login node. AS A WARNING, DO NOT GET IN THE HABIT OF RUNNING CODE ON THE LOGIN NODES, THEY ARE NOT DESIGNED FOR COMPUTE WORKLOADS.

In one of these screen windows, please type:


htop -u $USER

How many logical cores do you see at the top? login-0 and login-1 are actually the same type of hardware, but in login-0’s case, simultaneous multithreading (“SMT”) has been turned on, in which each physical cpu can run two threads, thus resulting in 32 physical cpus x 2 threads = 64 logical cpus. On login-1 SMT is disabled, so you will see 32 cpus.
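Besides looking at htop, you can print the logical core count directly with nproc, a standard coreutils command:

```shell
# Number of logical CPUs available to this session
nproc
```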

We are going to test our script now on the login node. Keep an eye on your processes in htop while you start the script in the other screen window:


bash example1.sh

You’ll see cat /dev/zero run for 15 seconds; note the CPU usage. It will probably be 100% unless the node is over-utilized. If you miss it, try running it again.

At the end, the process returns the hostname of the node it ran on!

OK. The program works! Now we want to modify this program and submit it to the queue.

Create an SBATCH script as follows:


nano example1_batch.sh

Edit the file as follows (notice you are just adding a bunch of #SBATCH commands), and change the [yourEMAIL] block to your actual email address, and update the partition and account information from the Pronghorn welcome email.


#!/bin/bash

#SBATCH --job-name=example1_job
#SBATCH --output=example1_output.txt
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=2GB
#SBATCH --hint=nomultithread
#SBATCH --time=00:01:00
#SBATCH --partition=[your partition]
#SBATCH --account=[your account]
#SBATCH --mail-user=[yourEMAIL]
#SBATCH --mail-type=ALL

bash example1.sh


Control-X, Y to save it.

You will notice the script starts with #!/bin/bash. The #! characters together are known as the “shebang”. You may recall from our last workshop that we used the bang (!) character to rerun a particular command. The shebang line tells Linux which interpreter to use when running the script/analysis.
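As a tiny self-contained illustration (using /tmp as a scratch location), the shebang is what lets you run a script directly as a program:

```shell
# Write a two-line script whose first line names its interpreter
cat > /tmp/shebang_demo.sh <<'EOF'
#!/bin/bash
echo "hello from bash"
EOF

chmod +x /tmp/shebang_demo.sh   # mark it executable
/tmp/shebang_demo.sh            # the kernel reads the shebang and invokes bash
```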

Ok, time to submit your job to the queue! Rather than “bash [your file]” you are going to:


sbatch example1_batch.sh

You should see it say “Submitted batch job XXXXXX”.

Note the “jobid” and then check the queue status by typing (be quick, this will only show that a job is in the queue or is currently running):


squeue --user $USER

You will see something like:


             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
           2550561 cpu-core- example1 hvasque  R       0:04      1 cpu-26

(If your status is PD the job is queued)

Type squeue a few more times (or press the up arrow and Enter) and watch its progress.

Hint: you can also use watch to monitor your jobs in “realtime”. Note it stops after ~30 seconds; the timeout 30 prefix makes sure you aren’t hammering the squeue command. You should generally avoid watching squeue continuously.


timeout 30 watch squeue --user $USER

The first column is your job ID, the second is the PARTITION you submitted to, the third is the job name you gave it, then your username, then the status (ST), which likely says “R”, meaning it’s running, or “PD” if it is pending (waiting for resources), then the number of nodes you asked for (1), then the node it is running on (cpu-XX). Your job may not start running immediately!!! You are sharing this with other people!
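If you ever need to pull one of those columns out in a script, awk can split the line; here is a sketch using the captured sample line above (not a live squeue call):

```shell
line='2550561 cpu-core- example1 hvasque  R       0:04      1 cpu-26'

# Field 5 is the job state: R (running) or PD (pending)
echo "$line" | awk '{print $5}'    # R
```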

You can get finer details about running jobs by typing:


sstat [jobid]  

And if you want to cancel the job (don’t do this right now):


scancel [jobid]

Once it’s done (you no longer see it in squeue) check the output file it created by:


less example1_output.txt

You should see it return the hostname of the computer the program ran on, cpu-XX – not the login nodes as previously.

While this was all happening, check your email. You should have gotten a notification when your job began and, afterwards, when it ended. The email notifications are not required, but can be helpful when you are trying to plan and track jobs.

Let’s break down this job script a bit more. All SBATCH directives must appear together at the top of a submission script, and are generally formatted as #SBATCH --<option>=<value>:

--job-name: a name for your job that will appear in the queue
--output: sends standard output from a job into a filename of your choice. Note your script can also do this explicitly.
--error: sends standard error from a job into a filename of your choice. Note your script can also do this explicitly.
--nodes: how many nodes can be used for the job?
--ntasks: how many unique “tasks” will your script run?
--cpus-per-task: for one task, how many cpus are allocated to it? If your task isn’t “internally parallelized”, this should be set to 1 cpu per task.
--mem-per-cpu: the maximum needed memory PER CPU. The nodes we are using have ~256GB RAM total (232GB is usable) and 64 logical cores, so evenly distributed we have ~4GB per cpu. You can ask for more, however.
--hint=nomultithread: if your process does not take advantage of SMT, disabling it can make understanding code execution easier.
--time: the maximum amount of time your script will take, after which it will be killed. As a general rule, be generous with this time; that said, the more accurate you are, the faster your job will begin running.
--partition: the “queue” name to use.
--account: the account your jobs are being charged to.
--mail-user: the email address to send notifications to (remove this if you don’t want emails).
--mail-type: what kind of notifications to email.

Congratulations, you just wrote your first BASH script, SLURM submission script, and ran this analysis using the SLURM scheduler.

Conda: creating a new environment for reproducible analysis

If you wanted to run an analysis that involved trimming reads and mapping to a genome, those programs are not installed on Pronghorn by default. However, most bioinformatic software can be easily installed with conda, which we set up last session. We will need to do a bit more configuration, and then we will create our first environment for this type of analysis.


(base) conda create -n workshoptools -c conda-forge curl jq
## Package Plan ##

  environment location: /home/hvasquezgross/miniconda3/envs/workshoptools

  added / updated specs:
    - curl
    - jq

[edited out]

Proceed ([y]/n)? 

Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
#     $ conda activate workshoptools
#
# To deactivate an active environment, use
#
#     $ conda deactivate

Retrieving notices: ...working... done

Let’s break down this command. We are using conda with the “create” subcommand. We are using the -n option to give the new environment the name “workshoptools”. Then, we are using the -c option to tell conda to look in the “conda-forge” channel. Finally, we have the list of programs we want to install: curl and jq.

Type Y to proceed with the installation.

You can search a list of conda installable packages from the website here: https://anaconda.org/

Now that the installation is finished, test running the jq command.


(base) [hvasquezgross@login-1 training]$ jq
bash: jq: command not found...

Why didn’t the command work? Didn’t we just install it?

Yes, we did! However, it was installed into a specific environment, not the “base” environment we are currently using. We need to activate that conda environment to have access to jq.

(base) [hvasquezgross@login-1 training]$ conda activate workshoptools
(workshoptools) [hvasquezgross@login-1 training]$ jq
jq - commandline JSON processor [version 1.6]

Usage:  jq [options] <jq filter> [file...]
        jq [options] --args <jq filter> [strings...]
        jq [options] --jsonargs <jq filter> [JSON_TEXTS...]

jq is a tool for processing JSON inputs, applying the given filter to
its JSON text inputs and producing the filter's results as JSON on
standard output.

The simplest filter is ., which copies jq's input to its output
unmodified (except for formatting, but note that IEEE754 is used
for number representation internally, with all that that implies).

For more advanced filters see the jq(1) manpage ("man jq")
and/or https://stedolan.github.io/jq

Example:

        $ echo '{"foo": 0}' | jq .
        {
                "foo": 0
        }

For a listing of options, use jq --help.

Great, now you know how to install programs using conda on Pronghorn!

For your work, you may find programs that are not available in conda but are installable with pip. For these programs, I still like creating a conda environment that includes pip. This ensures that the pip installation goes into a clean, brand-new Python environment.
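A minimal sketch of that pattern (the environment name mypipenv and the package name are hypothetical placeholders):

```
conda create -n mypipenv -c conda-forge python pip   # fresh environment with its own pip
conda activate mypipenv
pip install some-package   # installs into mypipenv, not your base environment
```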

SLURM interactive session for testing

Now that we know the basic BASH commands and how to install new programs with conda on Pronghorn, we should be able to perform our first sequence analysis.

Oftentimes when I’m developing/testing a new analysis method, I will request an interactive BASH session on Pronghorn. This requests computational resources and gives you a bash prompt on a compute node, rather than the login node we used earlier.


(base) [hvasquezgross@login-1 training]$ srun --job-name "InteractiveJob" --ntasks-per-node=8 --mem=20G --account=cpu-s1-inbre-0 --partition=cpu-s1-inbre-0 --time=24:00:00 --pty bash
srun: job 4479989 queued and waiting for resources
srun: job 4479989 has been allocated resources
(base) [hvasquezgross@cpu-53 training]$ 

Notice that after running the command, the @login-1 changes to @cpu-53, which is the specific compute node allocated for my requested configuration: 8 CPUs and 20 GB of memory for 24 hours.

Now we can activate our workshoptools conda environment and test the commands we installed.

(base) [hvasquezgross@cpu-53 training]$ conda activate workshoptools
(workshoptools) [hvasquezgross@cpu-53 training]$ curl
curl: try 'curl --help' or 'curl --manual' for more information
(workshoptools) [hvasquezgross@cpu-53 training]$ jq

We are going to be getting weather data from https://wttr.in via the command line using curl. You can find the documentation on the project’s github page here: https://github.com/chubin/wttr.in

Let’s test out some commands. We will start with a plain curl to get the default result from wttr.in:

(workshoptools) [hvasquezgross@cpu-53 training]$ curl wttr.in
Weather report: Reno, Nevada, United States

     \  /       Partly cloudy
   _ /"".-.     +55(51) °F     
     \_(   ).   ↑ 5 mph        
     /(___(__)  9 mi           
                0.0 in         
                                                       ┌─────────────┐                                                       
┌──────────────────────────────┬───────────────────────┤  Wed 25 Oct ├───────────────────────┬──────────────────────────────┐
│            Morning           │             Noon      └──────┬──────┘     Evening           │             Night            │
├──────────────────────────────┼──────────────────────────────┼──────────────────────────────┼──────────────────────────────┤
│     \   /     Sunny          │     \   /     Sunny          │               Overcast       │     \   /     Clear          │
│      .-.      +53(48) °F     │      .-.      +59(53) °F     │      .--.     +46(39) °F     │      .-.      +37(30) °F     │
│   ― (   ) ―   ↗ 13-14 mph    │   ― (   ) ―   ↗ 17-19 mph    │   .-(    ).   ↗ 6-7 mph      │   ― (   ) ―   ↗ 13-18 mph    │
│      `-’      6 mi           │      `-’      6 mi           │  (___.__)__)  6 mi           │      `-’      6 mi           │
│     /   \     0.0 in | 0%    │     /   \     0.0 in | 0%    │               0.0 in | 0%    │     /   \     0.0 in | 0%    │
└──────────────────────────────┴──────────────────────────────┴──────────────────────────────┴──────────────────────────────┘
                                                       ┌─────────────┐                                                       
┌──────────────────────────────┬───────────────────────┤  Thu 26 Oct ├───────────────────────┬──────────────────────────────┐
│            Morning           │             Noon      └──────┬──────┘     Evening           │             Night            │
├──────────────────────────────┼──────────────────────────────┼──────────────────────────────┼──────────────────────────────┤
│     \   /     Sunny          │     \   /     Sunny          │     \   /     Sunny          │     \   /     Clear          │
│      .-.      +33(30) °F     │      .-.      +42(41) °F     │      .-.      +42(41) °F     │      .-.      +35(32) °F     │
│   ― (   ) ―   ↓ 3 mph        │   ― (   ) ―   ↙ 3-4 mph      │   ― (   ) ―   ↙ 4-8 mph      │   ― (   ) ―   ← 3-8 mph      │
│      `-’      6 mi           │      `-’      6 mi           │      `-’      6 mi           │      `-’      6 mi           │
│     /   \     0.0 in | 0%    │     /   \     0.0 in | 0%    │     /   \     0.0 in | 0%    │     /   \     0.0 in | 0%    │
└──────────────────────────────┴──────────────────────────────┴──────────────────────────────┴──────────────────────────────┘
                                                       ┌─────────────┐                                                       
┌──────────────────────────────┬───────────────────────┤  Fri 27 Oct ├───────────────────────┬──────────────────────────────┐
│            Morning           │             Noon      └──────┬──────┘     Evening           │             Night            │
├──────────────────────────────┼──────────────────────────────┼──────────────────────────────┼──────────────────────────────┤
│     \   /     Sunny          │     \   /     Sunny          │     \   /     Sunny          │     \   /     Clear          │
│      .-.      41 °F          │      .-.      50 °F          │      .-.      +44(42) °F     │      .-.      41 °F          │
│   ― (   ) ―   → 1 mph        │   ― (   ) ―   ↓ 3 mph        │   ― (   ) ―   ↓ 4-9 mph      │   ― (   ) ―   ↓ 1-3 mph      │
│      `-’      6 mi           │      `-’      6 mi           │      `-’      6 mi           │      `-’      6 mi           │
│     /   \     0.0 in | 0%    │     /   \     0.0 in | 0%    │     /   \     0.0 in | 0%    │     /   \     0.0 in | 0%    │
└──────────────────────────────┴──────────────────────────────┴──────────────────────────────┴──────────────────────────────┘

But what if we want to get data for a different city (Detroit)? The GitHub link above has the documentation.

(workshoptools) [hvasquezgross@cpu-53 training]$ curl wttr.in/Detroit
Weather report: Detroit

                Overcast
       .--.     68 °F          
    .-(    ).   ↗ 4 mph        
   (___.__)__)  9 mi           
                0.0 in         
                                                       ┌─────────────┐                                                       
┌──────────────────────────────┬───────────────────────┤  Wed 25 Oct ├───────────────────────┬──────────────────────────────┐
│            Morning           │             Noon      └──────┬──────┘     Evening           │             Night            │
├──────────────────────────────┼──────────────────────────────┼──────────────────────────────┼──────────────────────────────┤
│               Overcast       │               Overcast       │               Overcast       │               Overcast       │
│      .--.     60 °F          │      .--.     66 °F          │      .--.     60 °F          │      .--.     60 °F          │
│   .-(    ).   ↑ 9-14 mph     │   .-(    ).   ↗ 8-11 mph     │   .-(    ).   ↗ 6-9 mph      │   .-(    ).   ↗ 3-6 mph      │
│  (___.__)__)  6 mi           │  (___.__)__)  6 mi           │  (___.__)__)  1 mi           │  (___.__)__)  1 mi           │
│               0.0 in | 0%    │               0.0 in | 0%    │               0.0 in | 0%    │               0.0 in | 0%    │
└──────────────────────────────┴──────────────────────────────┴──────────────────────────────┴──────────────────────────────┘
                                                       ┌─────────────┐                                                       
┌──────────────────────────────┬───────────────────────┤  Thu 26 Oct ├───────────────────────┬──────────────────────────────┐
│            Morning           │             Noon      └──────┬──────┘     Evening           │             Night            │
├──────────────────────────────┼──────────────────────────────┼──────────────────────────────┼──────────────────────────────┤
│    \  /       Partly cloudy  │               Cloudy         │     \   /     Sunny          │  _`/"".-.     Patchy rain po…│
│  _ /"".-.     60 °F          │      .--.     69 °F          │      .-.      +75(78) °F     │   ,\_(   ).   68 °F          │
│    \_(   ).   ↑ 8-14 mph     │   .-(    ).   ↑ 11-13 mph    │   ― (   ) ―   ↑ 11-18 mph    │    /(___(__)  ↑ 14-23 mph    │
│    /(___(__)  6 mi           │  (___.__)__)  6 mi           │      `-’      6 mi           │      ‘ ‘ ‘ ‘  6 mi           │
│               0.0 in | 0%    │               0.0 in | 0%    │     /   \     0.0 in | 0%    │     ‘ ‘ ‘ ‘   0.0 in | 79%   │
└──────────────────────────────┴──────────────────────────────┴──────────────────────────────┴──────────────────────────────┘
                                                       ┌─────────────┐                                                       
┌──────────────────────────────┬───────────────────────┤  Fri 27 Oct ├───────────────────────┬──────────────────────────────┐
│            Morning           │             Noon      └──────┬──────┘     Evening           │             Night            │
├──────────────────────────────┼──────────────────────────────┼──────────────────────────────┼──────────────────────────────┤
│     \   /     Sunny          │     \   /     Sunny          │  _`/"".-.     Patchy rain po…│  _`/"".-.     Patchy rain po…│
│      .-.      64 °F          │      .-.      71 °F          │   ,\_(   ).   +75(78) °F     │   ,\_(   ).   69 °F          │
│   ― (   ) ―   ↗ 8-12 mph     │   ― (   ) ―   ↗ 11-13 mph    │    /(___(__)  ↗ 10-16 mph    │    /(___(__)  ↗ 8-14 mph     │
│      `-’      6 mi           │      `-’      6 mi           │      ‘ ‘ ‘ ‘  6 mi           │      ‘ ‘ ‘ ‘  5 mi           │
│     /   \     0.0 in | 0%    │     /   \     0.0 in | 0%    │     ‘ ‘ ‘ ‘   0.0 in | 87%   │     ‘ ‘ ‘ ‘   0.0 in | 69%   │
└──────────────────────────────┴──────────────────────────────┴──────────────────────────────┴──────────────────────────────┘
Location: Detroit, Wayne County, Michigan, United States of America [42.3486635,-83.0567374]

What happens when we specify Reno in the URL like we did for Detroit?

It looks like we get data, but notice at the bottom of the output that it matched “Rhein” in Germany, not Reno, Nevada.

(workshoptools) [hvasquezgross@cpu-53 training]$  curl wttr.in/Reno
Weather report: Reno

      \   /     Clear
       .-.      +46(41) °F     
    ― (   ) ―   → 16 mph       
       `-’      6 mi           
      /   \     0.0 in         
                                                       ┌─────────────┐                                                       
┌──────────────────────────────┬───────────────────────┤  Thu 26 Oct ├───────────────────────┬──────────────────────────────┐
│            Morning           │             Noon      └──────┬──────┘     Evening           │             Night            │
├──────────────────────────────┼──────────────────────────────┼──────────────────────────────┼──────────────────────────────┤
│               Cloudy         │  _`/"".-.     Patchy rain po…│  _`/"".-.     Patchy rain po…│    \  /       Partly cloudy  │
│      .--.     48 °F          │   ,\_(   ).   50 °F          │   ,\_(   ).   +50(46) °F     │  _ /"".-.     +48(44) °F     │
│   .-(    ).   ↗ 1-2 mph      │    /(___(__)  ↖ 3-4 mph      │    /(___(__)  ↗ 6-9 mph      │    \_(   ).   ↗ 6-11 mph     │
│  (___.__)__)  6 mi           │      ‘ ‘ ‘ ‘  6 mi           │      ‘ ‘ ‘ ‘  6 mi           │    /(___(__)  6 mi           │
│               0.0 in | 0%    │     ‘ ‘ ‘ ‘   0.0 in | 84%   │     ‘ ‘ ‘ ‘   0.0 in | 78%   │               0.0 in | 0%    │
└──────────────────────────────┴──────────────────────────────┴──────────────────────────────┴──────────────────────────────┘
                                                       ┌─────────────┐                                                       
┌──────────────────────────────┬───────────────────────┤  Fri 27 Oct ├───────────────────────┬──────────────────────────────┐
│            Morning           │             Noon      └──────┬──────┘     Evening           │             Night            │
├──────────────────────────────┼──────────────────────────────┼──────────────────────────────┼──────────────────────────────┤
│  _`/"".-.     Patchy rain po…│  _`/"".-.     Patchy rain po…│    \  /       Partly cloudy  │               Cloudy         │
│   ,\_(   ).   +50(46) °F     │   ,\_(   ).   +50(46) °F     │  _ /"".-.     +48(44) °F     │      .--.     +46(42) °F     │
│    /(___(__)  ↗ 8-11 mph     │    /(___(__)  ↗ 9-12 mph     │    \_(   ).   ↗ 6-12 mph     │   .-(    ).   ↗ 8-14 mph     │
│      ‘ ‘ ‘ ‘  6 mi           │      ‘ ‘ ‘ ‘  6 mi           │    /(___(__)  6 mi           │  (___.__)__)  6 mi           │
│     ‘ ‘ ‘ ‘   0.0 in | 71%   │     ‘ ‘ ‘ ‘   0.0 in | 64%   │               0.0 in | 0%    │               0.0 in | 0%    │
└──────────────────────────────┴──────────────────────────────┴──────────────────────────────┴──────────────────────────────┘
                                                       ┌─────────────┐                                                       
┌──────────────────────────────┬───────────────────────┤  Sat 28 Oct ├───────────────────────┬──────────────────────────────┐
│            Morning           │             Noon      └──────┬──────┘     Evening           │             Night            │
├──────────────────────────────┼──────────────────────────────┼──────────────────────────────┼──────────────────────────────┤
│               Cloudy         │  _`/"".-.     Patchy rain po…│               Cloudy         │    \  /       Partly cloudy  │
│      .--.     +48(44) °F     │   ,\_(   ).   +53(50) °F     │      .--.     +51(48) °F     │  _ /"".-.     +51(50) °F     │
│   .-(    ).   ↗ 9-14 mph     │    /(___(__)  ↑ 11-13 mph    │   .-(    ).   ↑ 8-13 mph     │    \_(   ).   ↖ 8-14 mph     │
│  (___.__)__)  6 mi           │      ‘ ‘ ‘ ‘  6 mi           │  (___.__)__)  6 mi           │    /(___(__)  6 mi           │
│               0.0 in | 0%    │     ‘ ‘ ‘ ‘   0.0 in | 63%   │               0.0 in | 0%    │               0.0 in | 0%    │
└──────────────────────────────┴──────────────────────────────┴──────────────────────────────┴──────────────────────────────┘
Location: Rhein, Rhein-Hunsrück-Kreis, Rheinland-Pfalz, Deutschland [50.2369737,7.5801837]
(workshoptools) [hvasquezgross@cpu-53 training]$ curl wttr.in/Reno+NV
Weather report: Reno+NV

       .-.      Light rain
      (   ).    +46(42) °F     
     (___(__)   ↗ 8 mph        
      ‘ ‘ ‘ ‘   9 mi           
     ‘ ‘ ‘ ‘    0.0 in         
                                                       ┌─────────────┐                                                       
┌──────────────────────────────┬───────────────────────┤  Wed 25 Oct ├───────────────────────┬──────────────────────────────┐
│            Morning           │             Noon      └──────┬──────┘     Evening           │             Night            │
├──────────────────────────────┼──────────────────────────────┼──────────────────────────────┼──────────────────────────────┤
│     \   /     Sunny          │     \   /     Sunny          │               Overcast       │     \   /     Clear          │
│      .-.      +53(48) °F     │      .-.      +57(51) °F     │      .--.     +44(35) °F     │      .-.      +37(30) °F     │
│   ― (   ) ―   ↗ 14-16 mph    │   ― (   ) ―   ↗ 16-19 mph    │   .-(    ).   ↗ 8-9 mph      │   ― (   ) ―   ↗ 10-13 mph    │
│      `-’      6 mi           │      `-’      6 mi           │  (___.__)__)  6 mi           │      `-’      6 mi           │
│     /   \     0.0 in | 0%    │     /   \     0.0 in | 0%    │               0.0 in | 0%    │     /   \     0.0 in | 0%    │
└──────────────────────────────┴──────────────────────────────┴──────────────────────────────┴──────────────────────────────┘
                                                       ┌─────────────┐                                                       
┌──────────────────────────────┬───────────────────────┤  Thu 26 Oct ├───────────────────────┬──────────────────────────────┐
│            Morning           │             Noon      └──────┬──────┘     Evening           │             Night            │
├──────────────────────────────┼──────────────────────────────┼──────────────────────────────┼──────────────────────────────┤
│     \   /     Sunny          │     \   /     Sunny          │     \   /     Sunny          │     \   /     Clear          │
│      .-.      +35(32) °F     │      .-.      +46(44) °F     │      .-.      +44(41) °F     │      .-.      +35(33) °F     │
│   ― (   ) ―   ↓ 3-4 mph      │   ― (   ) ―   ↙ 3-4 mph      │   ― (   ) ―   ↙ 4-6 mph      │   ― (   ) ―   ← 3-8 mph      │
│      `-’      6 mi           │      `-’      6 mi           │      `-’      6 mi           │      `-’      6 mi           │
│     /   \     0.0 in | 0%    │     /   \     0.0 in | 0%    │     /   \     0.0 in | 0%    │     /   \     0.0 in | 0%    │
└──────────────────────────────┴──────────────────────────────┴──────────────────────────────┴──────────────────────────────┘
                                                       ┌─────────────┐                                                       
┌──────────────────────────────┬───────────────────────┤  Fri 27 Oct ├───────────────────────┬──────────────────────────────┐
│            Morning           │             Noon      └──────┬──────┘     Evening           │             Night            │
├──────────────────────────────┼──────────────────────────────┼──────────────────────────────┼──────────────────────────────┤
│     \   /     Sunny          │     \   /     Sunny          │     \   /     Sunny          │     \   /     Clear          │
│      .-.      41 °F          │      .-.      50 °F          │      .-.      +48(46) °F     │      .-.      +41(39) °F     │
│   ― (   ) ―   → 1 mph        │   ― (   ) ―   ↓ 1 mph        │   ― (   ) ―   ↘ 4-8 mph      │   ― (   ) ―   → 2-4 mph      │
│      `-’      6 mi           │      `-’      6 mi           │      `-’      6 mi           │      `-’      6 mi           │
│     /   \     0.0 in | 0%    │     /   \     0.0 in | 0%    │     /   \     0.0 in | 0%    │     /   \     0.0 in | 0%    │
└──────────────────────────────┴──────────────────────────────┴──────────────────────────────┴──────────────────────────────┘
Location: Reno, Washoe County, Nevada, United States of America [39.52927,-119.8136744]

These results correctly found Reno, NV.

However, while this output is pretty, it is not easily machine readable. The GitHub documentation for wttr.in describes how to get the data in JSON format. Let's try downloading the JSON data for Reno.

(workshoptools) [hvasquezgross@cpu-53 training]$ curl wttr.in/Reno+NV?format=j1
{
    "current_condition": [
        {
            "FeelsLikeC": "7",
            "FeelsLikeF": "44",
            "cloudcover": "100",
            "humidity": "68",
            "localObsDateTime": "2023-10-25 03:53 PM",
            "observation_time": "10:53 PM",
            "precipInches": "0.0",
            "precipMM": "0.0",
            "pressure": "1014",
            "pressureInches": "30",
            "temp_C": "9",
            "temp_F": "48",
            "uvIndex": "3",
            "visibility": "16",
            "visibilityMiles": "9",
            "weatherCode": "296",
            "weatherDesc": [
                {
                    "value": "Light rain"
                }
            ],
            "weatherIconUrl": [
                {
                    "value": ""
                }
            ],
            "winddir16Point": "W",
            "winddirDegree": "260",
            "windspeedKmph": "15",
            "windspeedMiles": "9"
        }
    ],
    "nearest_area": [
        {
            "areaName": [
                {
                    "value": "Reno"
                }
            ],
... (output truncated)

That’s a lot of text on the screen, but we did not actually save the data.

Let's use redirection to save this into a file named reno.json, then use less to view the contents.

(workshoptools) [hvasquezgross@cpu-53 training]$ curl wttr.in/Reno+NV?format=j1  > reno.json
(workshoptools) [hvasquezgross@cpu-53 training]$ less reno.json
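As a quick aside on redirection (using a hypothetical demo.txt, not part of the analysis): `>` truncates the target file and writes to it, while `>>` appends to whatever is already there.

```shell
# '>' overwrites the file; '>>' appends to it.
echo "first line"  > demo.txt
echo "second line" >> demo.txt
cat demo.txt     # shows both lines, in order
rm demo.txt
```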

Now, we are going to use the jq tool to filter the data in the reno.json file for the hourly temperatures. Covering how to use jq is outside the scope of this workshop; this is mostly to simulate working with data in your research domain.


(workshoptools) [hvasquezgross@cpu-53 training]$  jq ".weather[].hourly[].tempF" reno.json 
"48"
"44"
"44"
"54"
"57"
"48"
"44"
"38"
"38"
"40"
"37"
"35"
"46"
"52"
"44"
"36"
"33"
"31"
"30"
"41"
"50"
"56"
"48"
"41"

Let's save the results as reno_tempf.tsv.


(workshoptools) [hvasquezgross@cpu-53 training]$  jq ".weather[].hourly[].tempF" reno.json > reno_tempf.tsv
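As a quick sanity check of the results, a short pipeline can summarize the file. This is a sketch; it assumes reno_tempf.tsv was produced by the jq command above, with one quoted value per line.

```shell
# Strip the JSON quotes, then compute min/max/mean of the hourly temps.
tr -d '"' < reno_tempf.tsv | awk '
    NR == 1 { min = $1; max = $1 }
    { sum += $1; if ($1 < min) min = $1; if ($1 > max) max = $1 }
    END { printf "min=%s max=%s mean=%.1f\n", min, max, sum / NR }'
```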

Writing your first SBATCH script

This analysis runs quickly. However, if you had a larger dataset, or the programs you run took much longer to complete, it would be tedious to wait for each command to finish before starting the next. Instead, we can write an SBATCH script and submit the whole analysis to the SLURM job scheduler.

Writing a basic BASH/SBATCH script is doable for beginners. In fact, you have essentially already written a script by running the commands above!

If we go back through our history and copy/paste the commands in the same order, they should run the same way when submitted as a job. Let's try this now.


(workshoptools) [hvasquezgross@cpu-53 training]$ nano weather_analysis_sbatch.sh

Now, go back to your history and copy/paste the commands. HINT: You may want to use screen to keep your text editor open while looking at your history.
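One way to pull the commands out of your interactive session without retyping them (a sketch; the grep pattern here is just an example matching the commands we ran above):

```shell
# List recent commands, keep only the ones we want to reuse, and strip
# the leading history numbers so the lines can be pasted into the script.
history | grep -E 'curl|jq' | sed 's/^ *[0-9]* *//'
```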

Once completed, your SBATCH script should look similar to the one below. We will need to edit the --partition and --account parameters depending on what the cluster administrators have assigned; look this up in your "Welcome to Pronghorn" email. Also, we added a timeout after the jq command to simulate a computationally intensive operation, rather than having it finish in less than one second.


#!/bin/bash

#SBATCH --job-name=weather
#SBATCH --output=weather.log
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --mem=10GB
#SBATCH --time=00:20:00
#SBATCH --partition=[your partition]
#SBATCH --account=[your account]
#SBATCH --mail-user=[yourEMAIL]
#SBATCH --mail-type=ALL



curl wttr.in/Reno+NV?format=j1  > reno.json

jq ".weather[].hourly[].tempF" reno.json > reno_tempf.tsv

timeout 30s cat /dev/zero > /dev/null

###
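A note on that last line: `timeout 30s cat /dev/zero > /dev/null` streams zeros to nowhere, keeping one CPU busy until timeout kills the command after 30 seconds. When GNU timeout has to kill its command, it exits with status 124, which you can verify yourself (shortened to 2 seconds here so it is quick to try):

```shell
# Busy-work placeholder: timeout kills cat once the time limit is hit.
timeout 2s cat /dev/zero > /dev/null
echo "exit status: $?"    # prints: exit status: 124
```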

Now, let’s submit the job to the cluster.


(workshoptools) [hvasquezgross@cpu-53 training]$ sbatch weather_analysis_sbatch.sh

After submitting the job, monitor the running job by using the squeue command.

Transferring files from Pronghorn to your local computer

If you are on Linux, macOS, or Windows with WSL, you can use the terminal commands scp or sftp to download the data.

If you would like a graphical user interface, UNR’s OIT suggests WinSCP or Cyberduck.

During the workshop, we will go over how to configure and use these tools to transfer files.

Then, we will review the reno.json and reno_tempf.tsv files.

Advanced: Processing Multiple samples with an array SBATCH script

Let's make a new directory for this analysis in our training folder, named "array_sbatch".

(workshoptools) [hvasquezgross@login-1 training]$ cd  ~/training/
(workshoptools) [hvasquezgross@login-1 training]$ mkdir array_sbatch
(workshoptools) [hvasquezgross@login-1 training]$ cd array_sbatch/

Let's make a script named sbatch_array.sh with the following contents. We will need to edit the --partition and --account parameters depending on what the cluster administrators have assigned; look this up in your "Welcome to Pronghorn" email.


#!/bin/bash
#SBATCH --time=0-1 # days-hours
#SBATCH --job-name=training  # Job name
#SBATCH --array=1-12
#SBATCH --nodes=1
#SBATCH --ntasks=2 # Number of cores
#SBATCH --mem=8000 # Memory pool for all cores (see also --mem-per-cpu)
#SBATCH --partition=[YOUR_PARTITION_HERE]
#SBATCH --account=[YOUR_ACCOUNT_HERE]
#SBATCH --output=training-%A-%a.out # File to which STDOUT will be written
#SBATCH --error=training-%A-%a.err # File to which STDERR will be written
#SBATCH --mail-type=END # Type of email notification- BEGIN,END,FAIL,ALL
# #SBATCH --mail-user=youremail@unr.edu # Email to which notifications will be sent

## Record the start time
start=`date +%s`

## Record the host being run on
echo "Hostname: $(eval hostname)"


THREADS=${SLURM_NTASKS}
MEM=$(expr ${SLURM_MEM_PER_NODE} / 1024)

echo "Allocated threads: " $THREADS
echo "Allocated memory: " $MEM


##Load conda environment
source ~/.bashrc
source activate workshoptools


## provide the script the row # of the sample to be run
cityname=`sed "${SLURM_ARRAY_TASK_ID}q;d" locations.txt`

CMD="curl wttr.in/$cityname?format=j1  > $cityname.json"
echo $CMD
eval $CMD

CMD="jq \".weather[].hourly[].tempF\" $cityname.json > ${cityname}_tempf.tsv"
echo $CMD
eval $CMD

CMD="timeout 30s cat /dev/zero > /dev/null"
echo $CMD
eval $CMD

## Record the start time, and output runtime
end=`date +%s`
runtime=$((end-start))
echo $runtime

We will discuss the script in detail. The key element is the --array SBATCH option towards the top of the script, which is given the range 1-12. Further down, the variable "cityname" is assigned by a sed command that pulls the line matching SLURM_ARRAY_TASK_ID from the file "locations.txt". In locations.txt, we will add 12 city names for the array job to process.
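The `sed "${SLURM_ARRAY_TASK_ID}q;d" locations.txt` idiom simply means "print line N and quit": `d` suppresses every line, but `Nq` quits at line N before the delete runs, so only that line is printed. You can see it in isolation with a hypothetical demo file:

```shell
# Each array task picks its own city by line number.
printf 'Detroit\nParis\nHamburg\n' > demo_locations.txt
sed "2q;d" demo_locations.txt    # prints: Paris
rm demo_locations.txt
```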


(workshoptools) [hvasquezgross@login-1 array_sbatch]$ nano locations.txt

Detroit
Paris
Hamburg
Orlando
Bonn
Nashville
Boise
Cairo
Shenzhen
Rotterdam
Antwerp
Valencia

Now let's run the array job.


(workshoptools) [hvasquezgross@login-1 array_sbatch]$ sbatch sbatch_array.sh 

You will notice the .err and .out log file names include the main array job number as well as the array sub-job (task) number.
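For example, if the parent job ID were 4762673 (a hypothetical number), task 3 would write training-4762673-3.out and training-4762673-3.err. After the job has run, a glob collects the logs:

```shell
# %A in the #SBATCH lines expands to the parent array job ID,
# %a to the array task index, so each task gets its own log pair.
ls training-*-*.out | wc -l    # one .out per array task
```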

Singularity/Docker containers: RStudio Server Use Case

WINDOWS USERS NOTE: I was not able to get PuTTY working for SSH tunnels. Please use WSL to be able to tunnel successfully. WSL Installation Guide: https://learn.microsoft.com/en-us/windows/wsl/install

Sometimes programs are much harder to install, and there is no conda recipe for them. In these cases, developers usually provide a Docker or Singularity container for the software. These are self-contained environments that include all the dependencies needed to run the software.

Conda should work for ~95% of the software you want to set up. For the other 5%, you can use Singularity to run Singularity or Docker containers.

Let's use Singularity to pull a Docker image of RStudio Server and try running it.


(base) [hvasquezgross@login-1 training]$ mkdir singularity
(base) [hvasquezgross@login-1 training]$ cd singularity
(base) [hvasquezgross@login-1 singularity]$ singularity pull docker://rocker/rstudio:4.2
INFO:    Using cached SIF image
(base) [hvasquezgross@login-1 singularity]$ ls
rstudio_4.2.sif

You will now see rstudio_4.2.sif, which is the Singularity image.

Now let's set up directories to bind to the Singularity container.


(base) [hvasquezgross@login-1 singularity]$ mkdir -p run var-lib-rstudio-server
(base) [hvasquezgross@login-1 singularity]$ printf 'provider=sqlite\ndirectory=/var/lib/rstudio-server\n' > database.conf
(base) [hvasquezgross@login-1 singularity]$ ls
database.conf  rstudio_4.2.sif  run  var-lib-rstudio-server

Next, we have to choose a custom port to bind RStudio Server to. The default port is 8787; however, only one service per computer/node can listen on a given port, so everyone using the default would cause conflicts.

Therefore, we will need to assign ports for each user.

We will go around the room numbering off, starting at 8701. Be sure to replace YOURPORTNUMBERHERE below with your assigned port number. The following examples assume port 8787 was used.

(base) [hvasquezgross@login-1 singularity]$ printf 'www-port=YOURPORTNUMBERHERE\nwww-address=127.0.0.1\n' > rserver.conf

We have to set up a password to authenticate to our personal RStudio Server on Pronghorn. Then we can run the Singularity container.

Be sure to run this within a screen/tmux session. After you run this command, the command line won't return and it will appear to hang. This is normal.


PASSWORD='yourpassword' singularity exec \
   --bind run:/run,var-lib-rstudio-server:/var/lib/rstudio-server,database.conf:/etc/rstudio/database.conf,rserver.conf:/etc/rstudio/rserver.conf  \
   rstudio_4.2.sif \
   /usr/lib/rstudio-server/bin/rserver --auth-none=0 --auth-pam-helper-path=pam-helper --server-user=$(whoami)

We now have RStudio Server running on Pronghorn! However, we cannot access it, because Pronghorn has a firewall rule blocking port 8787, the port RStudio Server is listening on for connections.

We will need to create an SSH tunnel from our local workstation/laptop to Pronghorn in order to authenticate. This will act as if we are connected locally, but we are actually connecting to the remote server.

You will notice 8787:localhost:8787 below. You will need to update this to YOUR PERSONAL PORT NUMBER. Again, after running this command, the command line will appear to hang.


(base) hvasquezgross@jimkent:~$ ssh hvasquezgross@pronghorn.rc.unr.edu -N -L 8787:localhost:8787

Now, on your local workstation, open your internet browser and type in: http://localhost:8787/ or http://localhost:YOURPORTNUMBERHERE/

You should now be presented with the RStudio Server GUI!!!

BUT!!!!!

Currently, we are using resources on the login node! If we run intensive calculations there, it can affect other users. It would be better to run this on a CPU node. However, since we have to SSH tunnel to get access, this adds another layer of complexity.

Let's first start by writing our SBATCH script. In the script below, we use custom-set variables as well as variables set by SLURM and Singularity. Additionally, we use Python to find an unused port/socket to bind to, so that each user on a particular node is sure to be running on their own port.

SBATCH Script basis (but heavily modified): https://www.hpc.iastate.edu/guides/containers/rstudio

(base) [hvasquezgross@login-1 singularity]$ nano sbatch_rstudio.sh

#!/bin/bash
##0 days 8 hour run time
#SBATCH --time=0-8
##Be sure to change your account and partition
#SBATCH --partition=[your partition]
#SBATCH --account=[your account]
##Resources use 2 CPUs and 20 Gigs
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=2
#SBATCH --mem=20G
##Log file names
#SBATCH --output=rstudio-server-%j.out
#SBATCH --error=rstudio-server-%j.err

# Set OMP_NUM_THREADS to prevent OpenBLAS (and any other OpenMP-enhanced
# libraries used by R) from spawning more threads than the number of processors
# allocated to the job.
#
# Set R_LIBS_USER to a path specific to rocker/rstudio to avoid conflicts with
# personal libraries from any R installation in the host environment



cat > rsession.sh <<END
#!/bin/sh
export OMP_NUM_THREADS=${SLURM_JOB_CPUS_PER_NODE}
export R_LIBS_USER=${HOME}/R/rocker-rstudio/4.2
exec /usr/lib/rstudio-server/bin/rsession "\${@}"
END

chmod +x rsession.sh

mkdir -p ${HOME}/R/rocker-rstudio/4.2

export SINGULARITY_BIND="run:/run,database.conf:/etc/rstudio/database.conf,rsession.sh:/etc/rstudio/rsession.sh,var-lib-rstudio-server:/var/lib/rstudio-server"

# Do not suspend idle sessions.
# Alternative to setting session-timeout-minutes=0 in /etc/rstudio/rsession.conf
# https://github.com/rstudio/rstudio/blob/v1.4.1106/src/cpp/server/ServerSessionManager.cpp#L126
export SINGULARITYENV_RSTUDIO_SESSION_TIMEOUT=0

export SINGULARITYENV_USER=$(id -un)
export SINGULARITYENV_PASSWORD=$(openssl rand -base64 15)
# get unused socket per https://unix.stackexchange.com/a/132524
# tiny race condition between the python & singularity commands
readonly PORT=$(python -c 'import socket; s=socket.socket(); s.bind(("", 0)); print(s.getsockname()[1]); s.close()')
cat 1>&2 <<END
1. SSH tunnel from your workstation using the following command:

   ssh -L ${PORT}:localhost:${PORT} ${SINGULARITYENV_USER}@pronghorn.rc.unr.edu ssh -L ${PORT}:localhost:${PORT} -N ${HOSTNAME}

   and point your web browser to http://localhost:${PORT}

2. log in to RStudio Server using the following credentials:

   user: ${SINGULARITYENV_USER}
   password: ${SINGULARITYENV_PASSWORD}

When done using RStudio Server, terminate the job by:

1. Exit the RStudio Session ("power" button in the top right corner of the RStudio window)
2. Issue the following command on the login node:

      scancel -f ${SLURM_JOB_ID}
END

CMD="singularity exec rstudio_4.2.sif /usr/lib/rstudio-server/bin/rserver --www-port ${PORT} --auth-none=0 --auth-pam-helper-path=pam-helper --server-user=${SINGULARITYENV_USER} --rsession-path=/etc/rstudio/rsession.sh 1>&2"
echo $CMD
eval $CMD

printf 'rserver exited' 1>&2

Then submit the script to the job scheduler and run squeue to find the job number.

(base) [hvasquezgross@login-1 singularity]$ sbatch sbatch_rstudio.sh
(base) [hvasquezgross@login-1 singularity]$ squeue -u $USER
           4762673 cpu-s1-in  rstudio hvasquez  R      15:12      1 cpu-54

Next, we need to create a double SSH tunnel from: Workstation -> Pronghorn Login Node -> Pronghorn CPU node

The script we wrote figures out this connection information automatically and prints it to the .err file.


(base) [hvasquezgross@login-1 singularity]$ cat rstudio-server-4763527.err
1. SSH tunnel from your workstation using the following command:

   ssh -L 58741:localhost:58741 hvasquezgross@pronghorn.rc.unr.edu ssh -L 58741:localhost:58741 -N cpu-54

   and point your web browser to http://localhost:58741

2. log in to RStudio Server using the following credentials:

   user: hvasquezgross
   password: uDg0plt4Tvhs4RWrOVwk

When done using RStudio Server, terminate the job by:

1. Exit the RStudio Session ("power" button in the top right corner of the RStudio window)
2. Issue the following command on the login node:

      scancel -f 4763527

Then, we will connect using the same connection URL on our local workstation/laptop.

Success!! We have successfully run an RStudio Server container with Singularity on a CPU node to perform computations!!

Please be aware that your job will keep running for the full time requested in the SBATCH configuration. If you finish your analysis early, please use the scancel command to cancel the job so the resources are freed for others.

Pronghorn Wrap-up

You should now know enough to: log in to Pronghorn, install programs, request computational resources, interactively test programs, and write an SBATCH script to run a full analysis through the SLURM job scheduler. If there are any questions/comments, please ask now or email me at .