Using R on HPC Clusters Part 1

OLCF

Speaker: George Ostrouchov


This OLCF-hosted webinar tutorial teaches a basic workflow for using R on an HPC cluster. It focuses on parallel computing as a means of speeding up R scripts on a cluster computer. Many R packages offer some form of parallel computing, yet they rely on a much smaller set of underlying approaches: multithreading in compiled code, the Unix fork, and MPI. The tutorial takes a narrow path, focusing on packages that engage these underlying approaches directly yet are easy to use at a high level. The workshop is aimed at current users of OLCF, CADES, ALCF and NERSC.


Objectives


Learn a workflow to edit R code on your laptop and run it on an HPC cluster

Learn how to use multicore and distributed parallel concepts in R on an HPC cluster system


Day 1:


Single node: Hardware and software overview, then ways to use multiple cores on a single node: the mclapply function in the parallel package and multithreaded BLAS
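As a rough illustration of the Day 1 topic, the sketch below compares a serial lapply call with mclapply from the parallel package. It is not taken from the course materials: the slow_square function, the task count, and the choice of two cores are assumptions made for the example. On a cluster, the core count would normally come from the resources your batch job requested.

```r
library(parallel)

# Hypothetical stand-in for an expensive, independent task.
slow_square <- function(i) {
  Sys.sleep(0.01)  # pretend to do real work
  i^2
}

# mclapply relies on the Unix fork, which is unavailable on Windows,
# where mc.cores must be 1. The value 2 is an assumption for this sketch.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Serial baseline.
res_serial <- lapply(1:8, slow_square)

# Same computation, with the tasks spread over forked worker processes.
res_parallel <- mclapply(1:8, slow_square, mc.cores = n_cores)

# Both versions should produce the same list of results.
print(identical(res_serial, res_parallel))
```

Because mclapply has the same calling pattern as lapply, an existing lapply loop over independent tasks can often be parallelized by changing the function name and adding mc.cores.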


Day 2:

Distributed: Hardware review, then ways to use multiple nodes: MPI at a high level via the pbdMPI package, and matrix methods via the kazaam and pbdDMAT packages


Hands-on exercises workflow:


Edit your code in RStudio on your laptop -> push the code to GitHub/GitLab -> pull the code to the cluster and submit it as a batch job -> look at your output and circle back to editing.


This has the advantage of editing code in a familiar environment and running it in a common teaching environment. Other workflows are possible if you already know the tools.


We start with each user forking a GitHub exercise repository to their own GitHub account and working with it as described above. See the prerequisites that follow.

 

Git Repo: https://github.com/RBigData/R4HPC.git


Prerequisites:


This workshop is aimed at users of OLCF, CADES, ALCF and NERSC. Users without accounts on systems at those centers can take part in the lectures and the laptop hands-on parts of the course, but not in the hands-on parts on the HPC clusters. The course repo will be provided to all attendees. The workshop assumes that participants have done the following:


Have R installed on laptop

https://cran.rstudio.com/

Have RStudio Desktop installed on laptop

https://www.rstudio.com/products/rstudio/download/

Have git installed on laptop

https://happygitwithr.com/index.html

Are able to ssh to a remote machine

For Mac use Terminal

For Windows use PuTTY

See: https://github.com/olcf/foundational_hpc_skills/raw/master/intro_to_ssh/Intro_to_ssh_clients.pdf

Have worked with GitHub in RStudio. Have a GitHub account, know how to create or fork a repository and work with it from RStudio. Many tutorials are available on the web, for example: https://happygitwithr.com/index.html.

Know a few basic Unix commands for listing files, creating a directory, removing files, etc. Many resources are available, for example Intro to Unix or Unix Shell Crash Course.
