Caltech Resnick High Performance Computing Center
General
About
Mission
Research community
Getting help
Infrastructure
Sponsorship
Citing the Center
People
IMSS Leadership
IMSS Technical Leadership
IMSS Technical Staff
Contact
Services
Getting started
Performance & optimization
Software & environment support
Data transfer & storage
Interactive & visualization computing
Get in touch
Support
Getting Help
Self-service documentation
Consultations
System Status
Location
Rates
Cost Calculator
Compute Pricing
Compute Unit Definitions
Storage Pricing
Questions
Resources
Cluster Overview
Login Nodes
Standard Login
Visualization Login
Compute Nodes
GPU Nodes
Contact
Software
System Software
User Software
Pre-installed Software
Installation Methods
Module System
Spack
Anaconda/Conda
Singularity/Apptainer Containers
Software Guides
Abaqus
Known Issue
Solution
Example Job Script
Getting Help
MATLAB
Loading MATLAB
Running MATLAB on the Cluster
Parallel Computing
Tips
Jupyter Notebook
Launching Jupyter
Adding Conda Environments as Kernels
Accessing Group Directories
Julia Support
Configuration
Troubleshooting
cryoSPARC
Installation
Requesting Access
Connecting
Management Commands
Job Submission
RStudio
Launching RStudio
Package Installation
Example: Installing tidyverse
Troubleshooting
NVIDIA NGC
Account Setup
Configuration
Pulling Containers
Running Containers
NGC CLI Tool
Example SLURM Script
Popular NGC Containers
Relion using SBGrid
Setup
SBGrid Preferences
SLURM Submission Template
Verification
Troubleshooting
AlphaFold
Prerequisites
Setup
Job Submission
Output
Additional Options
Tips
VSCode
Prerequisites
Setup
First Connection
Advanced: Direct Compute Node Connection
Troubleshooting
Available Software Modules
Categories
Requesting New Software
See Also
Institutional Licenses
Request New Software
Open OnDemand
Features
Getting Started
Interactive Desktop
Troubleshooting
PATH Conflicts
Browser Issues
File Upload Failures
“Request Header Too Long” Error
Home Directory Quota
Support
System Status
Current Status
Active Announcements
Check the Cluster Yourself
Job queues
Node availability
Storage health
Job efficiency after a run
Notifications
Mailing List
Reporting an Issue
Training
Self-Study Resources
Internal
External (recommended)
Announcements
Citing the Resnick High Performance Computing Center
Recommended acknowledgement
BibTeX
CITATION.cff
Let us know when you publish
Questions
Policies
Acceptable use
Data handling
Export control
Security
Questions
Getting Started
Quick Start Guide
1. Get an account
2. Connect via SSH
Recommended SSH config
3. Move some data over
4. Find and load software
5. Submit your first job
6. Cheat sheet
7. Where to put your files
Next steps
Stuck?
Getting Started
Quick Links
First Steps
Logging In
Next Steps
Account Information
Getting an Account
For New Groups
For Existing Groups
Multi-Factor Authentication
Supported Methods
Eligibility Certification
HPC End-User Agreement
Running Jobs
SLURM Commands
Job Submission
sbatch
salloc
srun
Resource Request Parameters
Environment Variables
Queue Management
squeue
scancel
scontrol
Usage Reporting
sreport
sacct
Account Management
Task Launching
MPI Jobs
Example Batch Script
Fairshare & Job Priority
How fairshare works
Check your group’s shares and usage
See where your jobs sit in the queue
What raises your priority
Example Job Scripts
Basic Examples
Serial Job
Multi-threaded (OpenMP)
MPI (Multi-node)
GPU Jobs
Single GPU
Multi-GPU
Specific GPU Type
Python & Conda
Conda Environment
Jupyter Batch
Job Arrays
Parameter Sweep
Limit Concurrent Jobs
Applications
MATLAB
R
GROMACS
AlphaFold
Job Dependencies
Sequential Pipeline
Fan-out, Fan-in
Email Notifications
Generate a Custom Script
See Also
Best Practices
Resource Requests
Request What You Need
Match CPUs to Parallelism
Add Time Buffer
Job Arrays
Storage
Use the Right Location
Monitor Your Quota
Scratch Warning
I/O Performance
Code Efficiency
Profile First
Set Thread Counts
Checkpointing
Environment Management
Monitoring
Good Citizenship
Pre-Submit Checklist
Questions?
GPU Computing
Available GPUs
Requesting GPUs
Single GPU
Multiple GPUs
Specific GPU Type
Basic GPU Job
CUDA Programming
Load CUDA
Compile CUDA Code
Check GPU in Code
Deep Learning Frameworks
PyTorch
TensorFlow
JAX
Using Containers for Deep Learning
NGC Containers (Recommended)
Pull NGC Container
Multi-Node GPU Training
GPU Memory Management
Check Memory Usage
Reducing Memory Usage
Best Practices
Troubleshooting
“CUDA out of memory”
GPU not detected